THE ASSOCIATION FACTOR IN INTELLIGENCE
TESTING*


S. TOLANSKY 
Fellow of Armstrong College, Newcastle-upon-Tyne 
INTRODUCTION 


A good deal of work has been done on the reliability of intelligence
tests and the constancy of IQ, but very little on the analysis of response
error. Holzinger¹ considers a response error (6) as due to fluctuations
in effort, emotional status, concentration, etc. He finds that this is
roughly normally distributed. Stenquist² observes that, if the Terman
tests are repeated on a group, seven per cent are twenty points out,
and this is probably due to response error. An account is given in this
paper of an investigation on the association factor in intelligence
tests. The tests selected were from the American Army Alpha Test,
admittedly somewhat old. Instead of allowing, say, the usual two
minutes for a test, a modified technique was employed. The test
was covered up. The first question was uncovered, and the time taken
to answer it was noted with a stop watch reading to one-tenth of a second.
The next question was uncovered and the time taken. This procedure
was repeated to the end of the test. Very little time was spent in
covering and uncovering, and the timing errors cancel out. Only
those questions done in the time stipulated in the test are considered
for the score. By graphing times against the number of the question
it is strikingly shown that certain questions (different as a rule for each
person) give abnormal times. There are three possible causes of
delayed response to a question. They are: (1) ignorance of facts;
(2) inability or partial inability to answer, through weakness in
reasoning; (3) delayed reaction due to an emotional association.

*I wish to thank Mr. Vernon Brown for assistance in the preparation of this
paper.

In a good intelligence test (1) is reduced to a minimum and the 
cause (2), which is that concerned with intelligence, becomes important. 
However, (3) cannot be neglected by any means. Jung³ has shown in
his word association experiments that words involving an emotional
factor take a longer time for a response than do words which do not 
arouse such a factor. I have found as a result of experience that a 
delay up to ten seconds in ordinary word association is sufficiently 
common to be considered normal. Suppose then that in a test, which 
has not been completed in the stipulated time, there are words or 
ideas which call up associations emotionally toned, then, irrespective 
of intelligence, delays will take place over these. The net result is to 
reduce the IQ, for had the time not been uselessly absorbed more 
questions could have been answered. There are, of course, associations 
to every question. These can be classified into the useful, helping to 
solve the problem, and the useless, of the Freudian or Jung type. If 
all the questions have been completed in the time, then the delays do 
not affect the number attempted. The possibility of useless associa- 
tion is purely accidental, depending on previous experience, and this 
may partly account for disagreements in reliability and IQ constancy 
observations. 


EXPERIMENTAL METHOD 


Two tests were selected, a true-false test and a multiple choice 
test. Both were given to five women and five men college students. 
The reaction times were taken and the testees were carefully watched. 
The questions taking an abnormally long time were selected and the
testee’s attention drawn to them. In some the delay was easily
explained. The testees immediately admitted lack of knowledge, or
actual difficulty. In the others, which predominate, the delay was due to
useless association. The questions were dealt with individually
and a regular psychoanalysis carried out. In every case it was found 
that the cause of the delay was due to associations of emotionally toned 
complexes, some going back many years, some sexual, and some being 
repressed incidents of an unpleasant nature. In practically every 
case the examinee was hardly aware that the time taken was long, yet 
in some instances a change from an average of six seconds to a delayed 
time of seventeen seconds took place. It was often easy to tell from 
facial expression that the emotions were disturbed. 

It has been considered advisable to reproduce the two tests since
frequent reference will be made to them. To illustrate the effect of
useless association two tests have been selected at random from the 
twenty and the detailed analysis given. The results of the twenty 
tests are shown in Tables I and III and conclusions drawn from them 
are recorded. 


Description of Tests.

Test I

Type, multiple choice.
Sixteen questions are to be answered.
Time allowed: ninety seconds.

Subtest.

1. It is wiser to put money aside and not spend it all so that you may: (a) Prepare
   for old age and sickness; (b) collect all the different kinds of money;
   (c) gamble if you wish.

2. Shoes are made of leather because: (a) It is tanned; (b) it is tough, pliable and
   warm; (c) it can be blackened.

3. Why do soldiers wear wrist watches rather than pocket watches? Because:
   (a) They keep better time; (b) they are harder to break; (c) they are handier.

4. The main reason why stone is used for building purposes is because: (a)
   It makes a good appearance; (b) it is strong and lasting; (c) it is heavy.

5. Why is beef better food than cabbage? Because: (a) It tastes better; (b) it is
   more nourishing; (c) it is harder to obtain.

6. If someone does you a favour what should you do? (a) Try to forget it;
   (b) steal for him if he asks you to; (c) return the favour.

7. If you do not get a letter from home which you know was written, it may be
   because: (a) It was lost in the mails; (b) you forgot to tell your people to
   write; (c) the postal service has been discontinued.

8. The main thing farmers do is to: (a) Supply luxuries; (b) make work for the
   unemployed; (c) feed the nation.

9. If a man who can’t swim should fall into a river he should: (a) Yell for help
   and try to scramble out; (b) dive to the bottom and crawl out; (c) lie on his
   back and float.

10. Glass insulators are used to fasten telegraph wires because: (a) The glass
    keeps the pole from being burned; (b) the glass keeps the current from
    escaping; (c) the glass is cheap and attractive.

11. If your load of coal gets stuck in the mud what should you do? (a) Leave it
    there; (b) get more horses or men to pull it out; (c) throw off the load.

12. Why are criminals locked up? (a) To protect society; (b) to get even with
    them; (c) to make them work.

13. Why should a married man have his life insured? Because: (a) Death may
    come at any time; (b) insurance companies are usually honest; (c) his family
    will not then suffer if he dies.

14. In leap years February has twenty-nine days because: (a) February is a
    short month; (b) some people are born on February 29; (c) otherwise the
    calendar would not come out right.

15. If you are held up and robbed in a strange city you should: (a) Apply to the
    police for help; (b) ask the first man you meet for money; (c) borrow some
    money at a bank.

16. Why should we have Congressmen? Because: (a) The people must be ruled;
    (b) it ensures truly representative government; (c) the people are too many
    to meet and make their laws.


EXPERIMENTAL RESULTS 


In a test of this nature a variation in time for each subtest is
expected because of the varying times required for reading, etc.
This, however, is systematic and affects each person to a roughly equal
degree. Different rates of reading in different persons will seriously
affect the result, which is a bad fault in the test. The times taken
by the ten people A to J are given in Table I. The subtests completed
in ninety seconds by each individual are shown above the black line.
The total times are given to the nearest second and below this the
score obtained. A full analysis will follow later. The asterisk indicates
useless association; the other long times are accounted for under
headings (1) and (2) at the beginning of this article.

TABLE I
(Time Allowed: Ninety Seconds)

Problem                   A      B      C      D      E      F      G      H      I      J
1                      10.0*   7.2    9.0    5.2    5.4    8.8   15.0    4.8   13.0   15.1
2                       6.2    4.0    3.7    4.0    3.8    3.8    7.2    2.0    6.8    4.0
3                       7.0    4.3    3.9    5.0    5.6    6.3   12.5*   3.0*   5.9    6.3
4                       5.6    4.2    3.9    4.0    6.0    7.0    7.6    2.5    4.2    4.6
5                       4.7    6.7    3.9    2.0    5.8    7.0    6.0    2.5    4.1    3.3
6                       5.0    6.0    3.9    3.0    3.4    7.4    5.0    3.0    5.0    6.0
7                       9.2*   9.8*   6.2*  13.5*   7.4    8.2*  17.4*   4.2*   8.0*   6.0
8                       4.8    3.0    3.0    3.0    4.0    3.9    6.1    3.2    5.6    6.0
9                       6.2    6.5    5.2    5.0    7.0*   6.0    6.4    3.8    6.4    7.0
10                      9.0    6.4    8.8    4.0    6.0    6.5    8.4    3.8    7.0    8.4*
11                      5.4    6.2    5.0   12.0*   4.4   14.0*  10.0*   4.0*  17.2*   6.4
12                      4.2    3.8    4.1    3.0    3.5    5.0    6.0    3.5    4.4    6.0
13                      7.8*   8.0*   6.8*   5.0    6.5    7.5    7.8    3.8    5.2    8.0*
14                     11.6*   5.2    8.0    4.2    4.5    6.5   15.0    4.0    7.4    6.0
15                      6.0    5.2    5.0    5.0    4.6    5.0    8.0    2.8    5.0    6.8
16                     17.2*   7.8*   5.0    9.0*  18.5*  11.0*  12.0*  10.0   15.0*   7.4*

Total time in seconds   120     94     85     87     96    114    150     61    108    107
Mark score obtained      11     13     15     13     15     10      8     14     13     11
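The scoring rule, that only questions finished within the stipulated time count, can be illustrated with a minimal Python sketch. It uses A's times from Table I; the function name `completed_in_time` is hypothetical and not part of the original method, which was carried out by hand with a stop watch.

```python
from itertools import accumulate

# Times in seconds for testee A on the sixteen problems of Test I (Table I).
times_a = [10.0, 6.2, 7.0, 5.6, 4.7, 5.0, 9.2, 4.8,
           6.2, 9.0, 5.4, 4.2, 7.8, 11.6, 6.0, 17.2]
TIME_ALLOWED = 90.0  # seconds stipulated for Test I


def completed_in_time(times, limit):
    """Count the problems whose cumulative finishing time is within the limit."""
    return sum(1 for elapsed in accumulate(times) if elapsed <= limit)


n_done = completed_in_time(times_a, TIME_ALLOWED)
total_seconds = round(sum(times_a))  # Table I gives totals to the nearest second
print(n_done, total_seconds)
```

Under these assumptions the count for A agrees with the thirteen questions reported as done in time, and the rounded total agrees with the 120 seconds listed in Table I.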

The full analyses for A will now be given. A is a woman graduate 
twenty-two years old. It is convenient to classify the first part of 
each question as the statement and the three choices as I, II, III. 


1. Problem (1) takes 10 seconds. Statement causes the following associations:
“Foresight—insurance—doctor—illness—child—insurance.” A’s sister is ill and
the doctor called today. He is the doctor of a lady friend who married an ailing 
husband. The husband died recently and the doctor had asked how the wife 
existed and was told by A that the husband was heavily insured (see question). 
A child was born to the lady during the last illness of the father. A was surprised 
and cynical for the lady claimed she had no sex knowledge and confided in A that 
she did not sleep with her husband, a point which struck A very forcibly. 

2. Problem (7) takes 9.2 seconds and choice II gave “lack of letters—urging
at home—disappointment with mother—upbraiding.” Two years ago A was in
France for some months and her mother hardly ever wrote her although her sisters 
did. She was disappointed and complained on returning. She often blames her 
mother for failure to write. 

3. Problem (13) takes 7.8 seconds, the word “insured” giving
“insurance-agent—wife—peroxide—Peter.” There was a long pause after this. A said
“Oh dash!” when the word “insurance-agent” slipped out. That day her
mother told her she had seen an insurance-agent who had married a fair-haired
girl and now had a child, Peter. Some years ago A thought this man keen on her
(note the depreciating, perhaps jealous, remark of “peroxide” referring to the
wife’s fair hair). A said she was pleased he was now settled as he is a bit of a
philanderer. She was a little too anxious in disclaiming anything but friendly 
regard. 

4. Problem (14) takes a long time, 11.6 seconds, choice III gives two distinct 
trains of association: 

1. “Reform of Calendar—Julius Caesar—month of August (named after 
Augustus)—Harvest Festival in August.” A stopped and related the 
following. She remembered when three years old falling off a seat in 
church during a harvest festival. Someone picked her up and threatened 
to steal her from her parents. She even remembers the dress she was 
wearing at the time. 

2. ‘Change in Calendar—St. Cuthbert’s Church—vicar—scurvy trick.” 
At the above church A had taken a prominent part in social life. On 
removing she had neglected to say farewell to the vicar and thought this 
very bad. The vicar was very conservative and would certainly object to 
the recent proposal for fixing the date of Easter. 

5. The last question took an abnormal time, 17 seconds. Choice II caused a
whole group of associations connected with the fact that A believes present day
government is entirely undemocratic.

This concludes the analysis. Altogether thirteen questions were
done in time. This is indicated by the line at thirteen in Table I.
Column A of Table II gives a detailed result for testee A. Useless association
is indicated by the figure 1 in Tables II and IV, and where the time
is long, due to the absence of useful association only, this is shown
by a figure 2.* When the question has been wrongly answered a period
is given. Thus 1. indicates a wrong answer and useless association.
A thought problem (10) was of a technical nature, was afraid, and
guessed, wrongly. Eleven marks were scored. The average normal
time for questions, other than those marked by a 1, is five seconds.
The five type 1 associations take the long time of sixty-four seconds,
i.e., thirty-nine above the average. Since thirty seconds more than
the time allowed was taken, it is obvious that but for the useless
association A would have completed the whole test, and obtained three
more marks. The thirty-one per cent of the questions which are emotionally
toned take seventy-one per cent of the time allowed. From the
analysis given there is obviously no doubting the strength of the
emotional tone and its real effect. As seen from the table, A is not
the most extreme case.

* Referred to as types 1 and 2 later.

TABLE II
[Type 1 and type 2 associations and wrong answers for each problem of Test I and each testee A to J, with totals. The entries are not legible in this copy.]
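The arithmetic behind the figures quoted for A can be checked with a short Python sketch. All numbers are taken as stated in the text; the variable names are hypothetical.

```python
# Figures as stated in the text for testee A on Test I.
n_questions = 16       # problems in Test I
n_type1 = 5            # problems showing useless (type 1) association
normal_time = 5.0      # stated average normal time per problem, in seconds
type1_time = 64.0      # stated total time for the five type 1 problems
time_allowed = 90.0    # seconds stipulated for the test
overrun = 30.0         # seconds by which A exceeded the time allowed

# Time lost to useless association, above what five normal answers would need.
excess = type1_time - n_type1 * normal_time

share_of_questions = n_type1 / n_questions   # fraction of emotionally toned questions
share_of_time = type1_time / time_allowed    # fraction of the allowed time they took

print(excess, round(100 * share_of_questions), round(100 * share_of_time))
print(excess > overrun)  # True when the lost time exceeds the overrun
```

Because the time lost (thirty-nine seconds) is greater than the thirty-second overrun, the sketch is consistent with the statement that A would otherwise have completed the whole test.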

The second test, which was given to the same ten people, is as 
follows. It is the true-false type. Sentences had to be arranged into 
sense and marked true or false. There are twenty-four questions 
and one hundred twenty seconds are allowed. A mark is deducted 
for each error, as usual. 


Test II


1. Lions strong are. 

2. Houses people in live. 

3. Days there in a week eight are. 

4. Leg flies one have only. 

5. Months coldest are summer the. 

6. Gotten sea water sugar is from. 

7. Honey bees flowers gather the from. 

8. And eat good gold silver to are. 

9. President Columbus first the was America of. 
10. Making is bread valuable wheat for. 
11. Water and made are butter from cheese. 
12. Sides every has four triangle. 
13. Every times makes mistakes person at.
14. Many toes fingers as men as have. 
15. Not eat gunpowder to good is. 
16. Ninety canal ago built Panama years was the. 
17. Live dangerous is near a volcano to it. 
18. Clothing worthless are and for wool cotton. 
19. As sheets are napkins used never. 
20. People trusted intemperate be always can. 
21. Employ debaters irony never. 
22. Certain some death of mean kinds sickness. 
23. Envy bad malice traits and are. 
24. Repeated call for courtesies associations. 


Full results are given in Tables III and IV. Full analysis is given 
for C who has the best mark and finished in one hundred seven seconds. 
As usual, a mark is subtracted from the total for every error, to correct 
for guessing. C is a science student (man) aged twenty. 


1. Problem (10) takes 5.4 seconds. Associations were “cornfield—fresh air—
house—removal.” Five years ago C had to leave his house, the family going to a
country house next to what was then a cornfield. He is now of the opinion that the
change was an advantage, but was doubtful then. He recalled his mixed feelings
at going to live in the country.

2. Problem (19) takes 5.4 seconds, giving “napkin—set table—Christmas—
spilled wine.” Some years ago he had spilled a glass of wine with his napkin at a
Christmas dinner at home. Christmas is the only occasion on which he gets wine.
It was not replaced; he thought it very wasteful, ruined a tablecloth, and the
whole family laughed at him. He vividly recalls the incident. The mental
disturbance in the test was strong since he also answered wrongly, much to his
surprise, on later reading his answers.

3. Problem (23) takes 8.5 seconds giving “bad temper—teacher—blameless.”
Three years ago C was a pupil teacher in a school. The class teacher was bad
tempered, often striking children who were blameless. This affected him a good
deal.

4. The last problem takes 9.2 seconds. Associations are “Nurse—sister—
mother.” Last summer his sister cut her foot in the street on some glass. A
nurse living nearby had tended it. Last month the nurse fell ill and C’s mother
repaid the kindness by tending her for three weeks. C thinks the nurse fully
deserves this courtesy.

TABLE III
(Time Allowed: One Hundred Twenty Seconds)

Problem                   A      B      C      D      E      F      G      H      I      J
1                       5.0    2.4    3.0    1.7    3.0    2.0    4.0    1.9    2.4    4.1
2                       3.7    2.4    1.9    2.2    1.7    2.0    3.0    1.0    3.0    2.2
3                       4.3    2.0    2.2    1.4    2.0    2.4    3.3    2.4*   4.5*   2.2
4                       4.3    2.2    2.2    2.4    2.1    2.2    4.8*   2.0    4.0    4.0
5                       4.0    2.4    2.2    1.4    2.1    3.0    2.8    2.2    3.5    3.5
6                       3.1    2.8    2.2    1.9    2.0    2.8    3.0    2.5    5.0*   2.2
7                       3.3    2.4    3.0    2.2    2.0    3.2    3.0    2.5    4.7    2.8
8                      12.0*   2.5    3.2    2.0    3.1    5.2*   2.4    4.5*  10.0*   2.6
9                       2.6    3.0    2.8    1.6    4.0    2.5    2.8    2.0    3.0    4.0
10                      3.5    1.4    5.4*   1.6    9.0*   2.2    2.6    2.2    3.0    3.0
11                      3.8   10.5    4.0    2.0    6.0    3.9*   3.0    5.0   10.2*   2.1
12                      3.2    1.9    2.6    1.6    1.9    2.5    2.8    2.0    2.1    2.8
13                     12.5    3.7    2.7    9.0*   2.0    2.6*   3.0    4.7    3.9    4.0
14                      3.6    3.1    5.4    2.4    3.0    5.0*   6.4*   6.0*   4.0    3.0
15                      3.1    1.6    2.0    1.4    2.4    2.0    2.9    2.7    3.8    2.0
16                      6.2    3.0   12.0    3.1    2.4    3.8    9.4    4.8    5.6    3.1
17                      2.0    1.9    2.2    1.4    1.9    2.2    2.9    4.0    2.4    2.9
18                      5.0    2.0    4.0    1.6    3.0    2.8   11.4*   3.0    4.0    3.2
19                     16.4*   4.4*   5.4*   3.0*   2.6    3.0*   5.9    4.0    4.0    6.0*
20                      6.4*   2.9    6.6    2.4    2.6    5.0*   7.4*   3.6    3.0    3.8
21                      6.7*   4.0*   8.8    2.4    5.0*   2.4    5.2    3.2    3.4    3.1
22                      4.0    3.4    4.7    2.0    2.4    4.0*   5.0*   4.8*   4.1    4.6*
23                      4.5    3.2    8.5*   2.2    2.4    3.0    3.7*   3.5    5.0    3.8
24                      3.7    8.0    9.2*   2.0    2.2    8.2*   7.0*  17.0*  21.0*   4.2

Total time              129     77    107     53     68     78    108     91    120     80
Mark obtained            21     18     22     16     18     18     14     18     22     18





















































TABLE IV
[Type 1 and type 2 associations and wrong answers for each problem of Test II and each testee A to J, with totals. The entries are not legible in this copy.]











There were four type 2 questions. In (14) C wondered whether
or not thumbs were included as fingers and stopped some time. In
(16) he took twelve seconds. He did not know the answer but guessed
rightly. In (20) and (21) he had to pause and consider before deciding
whether they were true or false.

These two fully worked examples are illustrative of all. Very
profound associations were met with in some of the tests. I, whose
average is about four, takes twenty-one seconds over the last question.
His mind, he said, was a perfect blank. Analysis revealed a strong
sexual complex. No person completely escaped the effect.


DISCUSSION OF RESULTS


Tables II and IV yield valuable conclusions. Consider Table II. 
The totals for each individual are at the foot and for each question 
at the side. It is obvious that problems 7, 11, 13, and 16 have a very 
strong tendency to produce useless association. Eight out of ten 
persons have type 1 association in question 7. The mean time is 
8.8 seconds, while for the next question it is only 3.9 seconds. Reference
to the list of questions shows these problems are almost certain
to bring emotionally toned responses to anyone. Their content is
such as probably to affect the average person; therefore this type
of question ought not to be included in the test, since it results in
errors. This method then acts as a selective agent in improving tests.
There are altogether thirty-one of type 1 and twelve of type 2
associations. Fourteen mistakes were made (neglecting problem 9) and of
these nine are associated with type 1 and four with type 2 delays.
It was often found that a strong disturbance resulted in a wrong answer
as well as delayed response. There is very little tendency to
perseveration. This may be connected with the fact that a break occurs
between each problem. Problem 9 is an amusing reflection on the 
authors of the test. Seven persons gave wrong answers; they insisted 
that the answer given in the book was wrong; the times were not 
abnormal. In this test a certain amount of latitude has been allowed 
for different rates of reading. G only gets eight marks, but he is 
a very slow reader, only finishing ten. 

Now consider Table IV. Problem (19) is obviously unsuitable 
since seven people show associations of type 1 and there are six wrong 
replies. Question (24) is unsuitable and (8) and (22) are doubtful. 
(11) is of interest since there are five associations of the type 2 and as 
seen the question is actually somewhat more difficult than the rest for 
townspeople (but not for country people). Problem (16) has six 
of the type 2 associations and six wrong answers. This is due to 
absence of knowledge since the examinees are English and the question 
refers to America. In this test there are six and in Table II there are
eight questions which give no abnormalities of any description, i.e.,
everyone does them fairly quickly and gets them right. These are
apparently simple questions and the response time is small for each. 
A sprinkling of type 2 is desirable, helping to eliminate the less capable. 
Nineteen wrong answers were given (if problem (16), which referred 
to America, is neglected). Of these eleven are type 1 associations and 
one type 2. 


CONCLUSION 


It is apparent from a general survey of the tables that the association
factor is of real significance in affecting the rate of reply in a test.
In all, four hundred questions were set. Of these one hundred and four
take long times. Seventy-one of these are due to type 1 and thirty-three
to type 2 associations. There is a marked tendency for wrong
replies to occur with type 1 delays. Thus thirty-three wrong replies
are made, twenty occurring with type 1 delays, and five with type 2.
In many cases there is hesitation at the beginning of the test, the first
question taking a long time. This is natural, and was not taken as a
case of delay. Six out of the ten persons would have answered more
questions, and probably earned more marks, had it not been for
useless associations. Anderson⁴ has shown that speed of response in
word association diminishes very markedly as the age decreases from
fourteen years to eight years. It is thus quite probable that the delay
effects will become worse with tests applied to younger children.
These delays are present in all mental testing and must affect scores
and consequently correlations.
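The overall counts quoted above can be cross-checked against the per-test counts given in the Discussion. The following Python sketch does so; every number is taken from the text, except the Test II delay counts, which are derived here by subtraction and are not stated in the paper. The variable names are hypothetical.

```python
# Counts stated in the text; variable names are illustrative only.
persons = 10
questions_per_person = 16 + 24           # Test I plus Test II
total_questions = persons * questions_per_person

long_type1, long_type2 = 71, 33          # delayed responses, both tests together
test1_type1, test1_type2 = 31, 12        # Test I alone (Table II)

wrong = {"Test I": 14, "Test II": 19}         # wrong replies (problems 9 and 16 neglected)
wrong_type1 = {"Test I": 9, "Test II": 11}    # wrong replies with type 1 delays
wrong_type2 = {"Test I": 4, "Test II": 1}     # wrong replies with type 2 delays

long_total = long_type1 + long_type2
# Derived by subtraction (an assumption of this sketch, not a stated figure).
test2_type1 = long_type1 - test1_type1
test2_type2 = long_type2 - test1_type2

wrong_total = sum(wrong.values())
wrong_with_type1 = sum(wrong_type1.values())
wrong_with_type2 = sum(wrong_type2.values())

print(total_questions, long_total, wrong_total, wrong_with_type1, wrong_with_type2)
print(round(100 * long_total / total_questions))  # per cent of questions taking long times
```

The per-test figures add up to the totals quoted in the Conclusion, which suggests the two sets of counts were compiled consistently.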

The method employed can be made use of as a weapon in the
improvement of a test, by selection. While the test used is not one
of the later improved types, the conclusions are in no way invalidated,
for any test which contains similar ideas to these (and most do) will
produce associations. The results indicate that an attempt should be
made to draw up tests in which a tendency to associate will be
improbable. To show the effectiveness of this method of examining tests
a curve is drawn in figure 1 with the results of I and J superposed
(Table IV). I has made no errors due to type 1 and has twenty-two
marks; J’s time is eighty and I’s time one hundred twenty seconds.
The curves are fairly parallel if the type 1 delays are not counted.
Thus a drop from (4) to (5), a rise from (15) to (16), then a drop to
(17), following with a rise to (18) is common to both curves. It is
obvious that I and J are of about the same intelligence and actually
if not for type 1 errors both would get twenty-two marks. Questions
(8), (10) and (24) are of especial interest, showing how one person
takes three and four seconds in a normal response, the other taking
ten and twenty seconds in an abnormal response. Emotionally toned
association, then, can act in two ways: (1) It can reduce the number
of questions done; (2) it can produce wrong answers by disturbing
the emotions.

To test the effect on correlation, a group of experiments is now
being conducted in conjunction with Mr. Vernon Brown. Tests
are being given to a hundred children. A group of ordinary tests,
involving verbalism and likely to give associations, is being given
and their correlation determined. To the same children a group of
the “illiterate” type test, largely geometrical in nature, is also being
given, and their correlation determined. Since association is less
likely here, the correlation is expected to be higher. Results will be
communicated later.


TETRAD-DIFFERENCES FOR VERBAL SUBTESTS
RELATIVE TO NON-VERBAL SUBTESTS


WILLIAM STEPHENSON 


University College, London University 
INTRODUCTION 


The present paper continues consideration of data gathered from a
population of 1037 girls to whom had been applied a “non-verbal”
and a “verbal” group test.*®

The Spearman Theory of Two Factors has been shown to fit the 
non-verbal subtest intercorrelations with exactness,‘ while the verbal 
subtests only approximately fit the Theory. We observed excess 
error in the tetrads for the verbal subtests, of amount about 0.015, 
and the many sources of error brought forward by us were inadequate 
singly to explain it. At the conclusion of our previous paper,’ it was 
hinted that a summation of many small disturbances might be taken 
to explain the observed 0.015 excess error; but we could suspect, with 
at least equal reason, that further influences might be at work in the 
verbal subtests. This latter suspicion, indeed, has some support from 
data for correlation tables (not reported here) for 100, 200, and 500 
sub-populations of the 1037 girls, where the non-verbal subtests showed 
nearly exact diminution of tetrad error as the sub-population was 
increased while the verbal subtests uniformly showed excess error in 
the tetrads, no matter what the size of the sub-population. 

However, the present paper is to be concerned with intercorrelation 
tables containing both non-verbal and verbal subtests, and new facts 
might be expected to emerge from the comparison that we are to make. 


INTERCORRELATIONS AND TETRAD-DIFFERENCES 


Table I gives intercorrelations for the seven verbal subtests Nos. 
2 to 8° and the four non-verbal subtests Nos. I, III, V, and VIII. 
All were calculated for standard ‘‘normal”’ score distributions;* were 
calculated by the difference method of scores; and were checked in the 
usual way, and at the instigation of the tetrad criterion. We have 
used only four non-verbal subtests instead of the available eight’ 
because a table containing all would be unwieldy for tetrad purposes;
but in a later table we use other non-verbal subtests.

We note in the first place that the battery of seven verbal subtests
correlates 0.65 with the battery of four non-verbal. Assuming
that other subtests would provide correlations similar to those in 
Table I, a battery of very many verbal subtests would correlate 0.82 
with a battery of very many non-verbal subtests. There is shown, 
then, a large measure of concomitance for the verbal and non-verbal 
abilities, a fact that should not be overlooked when we turn to our 
main detailed interest in the placing of both types of subtests in tetrads. 
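The step from a correlation of 0.65 between the two finite batteries to 0.82 for batteries of very many subtests can be illustrated with the usual formula for the correlation between two sums of standardized scores. The Python sketch below is an assumption about how such figures may be obtained, not the author's stated procedure. It also simplifies by using one uniform correlation within each battery and one between batteries; the three values are illustrative, chosen to be of about the size of the legible entries of Table I, and the function names are hypothetical.

```python
import math


def battery_correlation(n_a, n_b, r_aa, r_bb, r_ab):
    """Correlation between the sums of two batteries of standardized subtests.

    n_a, n_b : number of subtests in each battery
    r_aa     : uniform intercorrelation within battery a
    r_bb     : uniform intercorrelation within battery b
    r_ab     : uniform correlation between a subtest of a and a subtest of b
    """
    covariance = n_a * n_b * r_ab
    var_a = n_a + n_a * (n_a - 1) * r_aa
    var_b = n_b + n_b * (n_b - 1) * r_bb
    return covariance / math.sqrt(var_a * var_b)


def limiting_correlation(r_aa, r_bb, r_ab):
    """Value approached as both batteries grow without limit."""
    return r_ab / math.sqrt(r_aa * r_bb)


# Illustrative uniform values (assumptions, not figures from the paper).
r_nonverbal, r_verbal, r_cross = 0.38, 0.54, 0.37

finite = battery_correlation(4, 7, r_nonverbal, r_verbal, r_cross)
limit = limiting_correlation(r_nonverbal, r_verbal, r_cross)
print(round(finite, 2), round(limit, 2))
```

With these rounded inputs the sketch gives values close to the 0.65 and 0.82 quoted in the text, which is why the formula is offered as a plausible reading.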

We turn at once to this detailed examination of tetrads for Table
I. The correlations with age show no discrimination for the verbal
compared with the non-verbal subtests (see Table III‘ and II*), and
we continue without partialling out age correlation: Correction for
age correlation usually accounts for at most 0.005 excess error in our
tetrads. We note that specificality might be expected for $r_{V6}$, since
subtests V and 6 are of analogy type.

TABLE I. PRODUCT-MOMENT CORRELATIONS; N = 1037 GIRLS
(Decimal points omitted. Entries marked ? are not legible in this copy.)

           I   III     V  VIII     2     3     4     5     6     7     8
I       ....  4124  3733  3207  3737  4217  3273  3878  3699  4064  3470
III     ....  ....  4711  3508  3506  3279  3683  3359  3821  3976  4056
V       ....  ....  ....  8278  4602  4145  3908  3826  5029  4985  4797
VIII    ....  ....  ....  ....  2514  2666  2532  2759  3109  3067  3014
2       ....  ....  ....  ....  ....  6013  5589  5985  5676  5845  5883
3       ....  ....  ....  ....  ....  ....  4623  5924  5266  5599  5106
4       ....  ....  ....  ....  ....  ....  ....  4474  4508  5019  5046
5       ....  ....  ....  ....  ....  ....  ....  ....     ?     ?     ?
6       ....  ....  ....  ....  ....  ....  ....  ....  ....     ?     ?
7       ....  ....  ....  ....  ....  ....  ....  ....  ....  ....     ?
8       ....  ....  ....  ....  ....  ....  ....  ....  ....  ....  ....


There are 990 tetrad-differences for Table I, with values as follows:


Mean of 990 tetrad-differences .................... 0.0355
Observed pe (conventional, mean × 0.845) .......... 0.0300
Theoretical PE (formula 16A?) ..................... 0.0105


Thus, the table shows error of amount 0.028 in excess of that
attributable to sampling; and this is not noticeably diminished if tetrads
involving $r_{V6}$ are eliminated.
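The figures above are consistent with two simple steps: the conventional probable error is the mean tetrad-difference multiplied by 0.845, and the excess is what remains when the theoretical probable error is removed in quadrature. The second step is an assumption of this Python sketch, adopted because it reproduces the stated 0.028; the paper does not spell out the rule. Function names are hypothetical.

```python
import math


def observed_pe(mean_tetrad_difference, factor=0.845):
    """Conventional probable error: mean tetrad-difference times 0.845."""
    return mean_tetrad_difference * factor


def excess_error(observed, theoretical):
    """Error remaining after removing sampling error in quadrature (assumed rule)."""
    return math.sqrt(observed ** 2 - theoretical ** 2)


pe_all = observed_pe(0.0355)              # mean of the 990 tetrad-differences
excess_all = excess_error(pe_all, 0.0105) # theoretical PE as stated
print(round(pe_all, 4), round(excess_all, 3))
```

The same rule applied to a reduced set of tetrads, for example a mean of 0.0226, gives an excess of about 0.016, again under the quadrature assumption.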

Of the various tetrads entailed, those for only non-verbal subtests
give sampling error values, and the tetrads for only verbal subtests
entail error of amount 0.018 in excess of sampling error, as found in
the previous paper.* This latter, spread over the 990 tetrad-differences,
would amount to just about 0.002 of the 0.028 excess that is actually
observed. The remaining tetrads, to be our concern in the following
pages, may be considered for convenience in groups of the following
types:

\[
\begin{aligned}
r_{n_1 v_1} \cdot r_{n_2 n_3} - r_{v_1 n_2} \cdot r_{n_1 n_3} &= F(z_1) \\
r_{n_1 v_1} \cdot r_{v_2 v_3} - r_{n_1 v_2} \cdot r_{v_1 v_3} &= F(z_2) \\
r_{n_1 v_1} \cdot r_{n_2 v_2} - r_{n_1 v_2} \cdot r_{n_2 v_1} &= F(y) \\
r_{n_1 n_2} \cdot r_{v_1 v_2} - r_{n_1 v_1} \cdot r_{n_2 v_2} &= F(x)
\end{aligned}
\]

Where n stands for a non-verbal, and v for a verbal subtest, the subscripts
denoting different subtests, taken for all the combinations possible.
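The counts of tetrads quoted in the text can be checked by enumeration. The Python sketch below assumes the usual convention that every set of four subtests yields three distinct tetrad-differences, and groups the sets by the number of non-verbal subtests they contain. The subtest labels follow Table I; the helper `tetrads` and the grouping are illustrative, not the author's procedure.

```python
from collections import Counter
from itertools import combinations

non_verbal = ["I", "III", "V", "VIII"]
verbal = ["2", "3", "4", "5", "6", "7", "8"]
tests = non_verbal + verbal


def tetrads(a, b, c, d):
    """The three distinct tetrad-differences of four tests.

    Each entry lists the pairs (p, q), (r, s), (t, u), (v, w) standing for
    r_pq * r_rs - r_tu * r_vw.
    """
    return [((a, b), (c, d), (a, c), (b, d)),
            ((a, b), (c, d), (a, d), (b, c)),
            ((a, c), (b, d), (a, d), (b, c))]


# Key: how many of the four subtests are non-verbal.
counts = Counter()
for quad in combinations(tests, 4):
    n_nonverbal = sum(t in non_verbal for t in quad)
    counts[n_nonverbal] += len(tetrads(*quad))

total = sum(counts.values())
print(total, dict(counts))
```

Under this convention, sets of three non-verbal subtests and one verbal give 84 tetrads and sets of one non-verbal and three verbal give 420, matching the numbers given below for the two z types, and the grand total is 990.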


Tetrads of Type z1.—Specificality in the "cross" correlations
($r_{nv}$) will be shown in these tetrads, and it is important to consider this
type before the x-type, because the "cross" correlations enter critically
into x-type tetrads. The 84 tetrads of type z1 have the following value:


Mean of 84 tetrad-differences .......................... 0.0241
Observed sigma (× 0.6745) .............................. 0.0194
Or, observed pe (mean × 0.845) ......................... 0.0204
Theoretical PE approximately ........................... 0.0105


We have mentioned that $r_{V6}$ may entail specificality, and the
elimination of tetrads involving $r_{V6}$ leaves 76 tetrads, with mean 0.0226,
i.e., observed pe 0.0191 (conventional). The z1 tetrads thus entail
error 0.016 in excess of sampling error.

To explain this excess we may try out the various possible disturbers
of tetrads considered in previous papers. That is, can age
correlations, "speed preference," "similarity of relations," group testing
effects, calculation mistakes and the like, account for the residual
excess error, either separately or as a whole?

Eliminating all z1 tetrads that might be disturbed by "speed" and
"similarity of relations," leaves 24 tetrads with mean 0.0173, that is,
observed pe 0.0146, the theoretical value being approximately 0.0105;
and partialling out the C-measure (the effect of school and class’)
this mean is reduced to 0.0157, that is, an observed pe 0.0133. The
excess error is now only 0.008, and it might be accommodated by age
and calculational mistakes; using the proper sigma, instead of the
conventional value (mean × 0.8453), also slightly reduces this 0.008
value. Thus, elimination of one or two possible disturbances would
result in the z1 tetrads showing sampling error only.

Tetrads of Type z2.—There are 420 tetrad-differences of this type, 
with value as follows: 


Mean of 420 tetrad-differences ......................... 0.0253
Observed pe (mean × 0.845) ............................. 0.0214
Theoretical PE approximately ........................... 0.0105


There is excess error of amount about 0.018 in the above tetrads, a 
value similar to that observed for the verbal subtests.* It is obvious 
that any specificality present in the verbal subtests will be potent also 
in these z2 tetrads. 

To explain this excess error we may try out all the possible
disturbers of tetrads known to us. But the result is the same as that
obtained for the verbal subtest intercorrelations, and we are left with
excess error of amount about 0.015 (see 5) that can not receive an
explanation except by way of assuming a summation of many slight
errors of the kind put forward for the z1 tetrads.

Tetrads of Type y.—The 126 tetrad-differences of this type have 
value: 


Mean of 126 tetrad-differences ......................... 0.0171
Observed pe (mean × 0.845) ............................. 0.0144
Theoretical PE approximately ........................... 0.0105


Omitting the tetrads which involve $r_{V6}$ brings this mean down to
0.0164, i.e., observed pe 0.0138. The excess error, like that for the
z1 tetrads, can be accommodated by age, C-measure, and "speed"
effect. Even before making allowance for these two or three influences,
the excess error is only 0.009. The y-type tetrads may thus be taken
to fit the tetrad theory.

At the present juncture we should ask whether an explanation
of the excess error observed in the various tetrads, in terms of a
summation of a few slight errors due to effects such as age, calculation
mistakes, etc., is reasonable and likely to have a factual basis. On the
one hand the non-verbal subtest intercorrelations show no excess error
in their tetrads.⁴ But the various suggested disturbers of tetrads
mentioned above (and in previous papers) are authenticated in other
work. The influence of disturbers such as "speed" (for subtests
3, 6, 7), "propinquity" (especially for $r_{78}$), and idiosyncrasies (for
subtests 2 and 5, relative to all others), can be observed by using the
verbal subtests 4, 5, 6, and 8, together with the non-verbal subtests.
It is then found that (on eliminating tetrads involving $r_{V6}$) a noticeable
diminution is observed in tetrads which, apparently, are not disturbed
by influences of "speed," "propinquity," and "idiosyncrasies." But
it would require 9 errors, each of magnitude 0.005 (the amount usually
found for the influence of age correlation), to cover an observed excess
error of 0.015 ($\sqrt{9 \times 0.005^2} = 0.015$). Assuming an excess observed
error of 0.005 per influence, for each of age, school and class, "speed,"
"propinquity," "idiosyncrasies," group testing, "similar relations,"
and calculation mistakes, a total excess error of amount 0.015 would
result. The assumption, however, is perhaps at most only on the
border-line of acceptability. Thus it is possible that the 0.015 excess
error observed in the case of the verbal subtest intercorrelations,⁵ and
for the z2 type tetrads, can not be reasonably accepted as due to a
summation of numerous small errors of the kind brought forward,
while, on the whole, the excess error shown by the z1 and y tetrads
(less than 0.010 in each case) is possibly reasonably acceptable.
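
The arithmetic of this paragraph rests on the assumption that independent errors combine as the root of the sum of their squares, not by simple addition. A worked version, under that assumption and taking every error as 0.005, may be set out as follows; the second line is only an illustration of what the eight influences actually named would give.

```latex
% Nine independent errors of 0.005 each, combined in quadrature:
\[
  \sqrt{\underbrace{0.005^2 + \cdots + 0.005^2}_{9\ \text{terms}}}
  = \sqrt{9 \times 0.005^2} = 3 \times 0.005 = 0.015 .
\]
% The eight influences named in the text (age, school and class, speed,
% propinquity, idiosyncrasies, group testing, similar relations,
% calculation mistakes) would give slightly less:
\[
  \sqrt{8 \times 0.005^2} = 2\sqrt{2} \times 0.005 \approx 0.0141 .
\]
```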

Tetrads of Type x.—This is the last type of tetrads to be examined.
The total 252 x-type tetrad-differences have the following value:





Mean of 252 tetrad-differences ......................... 0.0703
Observed pe (mean × 0.845) ............................. 0.0595
Theoretical PE approximately ........................... 0.0105


Of the 990 tetrad-differences for Table I, (i) all the largest are of x-type,
and (ii) all but three of the x-type are of positive sign when the tetrads
are in the form given at (x) above. (Of course, unless otherwise
stated, no regard is taken of the sign of tetrad-differences, because for
each positive difference there is an equal negative difference.)

If we omit tetrads involving $r_{V6}$, 234 x-type tetrads remain, with
mean 0.0726. No matter what effects we try to make allowances for
we always encounter excess error of amount about 0.059. If we allow
0.015 excess error, attributable to a sum of disturbances of the kind
accepted previously for other tetrads, there still remains excess error of
amount 0.057 in the x-type tetrads for which we have as yet no
explanation.
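
The two figures 0.059 and 0.057 can be retraced, on the assumption that each allowance is removed in quadrature and starting from the observed pe of 0.0595 given above for all 252 x-type tetrads:

```latex
% Remove the theoretical sampling error (PE 0.0105):
\[
  \sqrt{0.0595^2 - 0.0105^2} = \sqrt{0.00354025 - 0.00011025}
  = \sqrt{0.00343} \approx 0.0586 .
\]
% Then allow a further 0.015 for the summed minor disturbances:
\[
  \sqrt{0.00343 - 0.015^2} = \sqrt{0.003205} \approx 0.0566 .
\]
```

These agree, to the three decimals quoted, with the values of about 0.059 and 0.057.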

The contributions of the $r_{vv}$ correlations to the x-tetrads are given
in summary form by tetrads of the following type, where $r_{vv}$ is the
correlation concerned, taken with all the available non-verbal subtests,
two at a time:


$$r_{v_1 v_2} \cdot r_{n_1 n_2} - r_{v_1 n_1} \cdot r_{v_2 n_2} = F \qquad (w)$$

There are 12 tetrads of this type for each $r_{vv}$, and means of these
sets of 12 are given in Table II. The mean for the 12 w-tetrads for
$r_{23}$ is 0.0992, for $r_{24}$ is 0.0910, etc. No single correlation among the
verbal subtests gives a small difference in these w-tetrads. The excess
error seems to cover all the verbal subtest correlations relative to
the non-verbal.


TABLE II.—MEANS FOR 12 TETRAD-DIFFERENCES (W-TYPE), FOR TABLE I


CONSIDERATION OF THE X-TYPE ERROR


It is obvious that neither age, calculation mistakes, nor propinquity
effects can account for the residual error shown by the x-tetrads.
Any general effects due to author's idiosyncrasies can not be taken to
be potent, because x-tetrads for $r_{23}$, $r_{24}$, $r_{26}$, $r_{27}$ and $r_{28}$, and the
corresponding correlations for subtest 5, should be free from residual error
if idiosyncrasies were effective, since subtests 2 and 5 are Thorndike
subtests, while the rest are of the author's construction.

Group testing anomalies can not be taken readily to explain the
residual error. The effect is not noticed in the tetrads of any other
type. It is true, though, that any effect characteristic of
verbal subtests only (or non-verbal subtests only) would be observable
under the critical conditions presented by the x-tetrads. The x-type
error, however, has been observed previously under conditions free
from group testing anomalies;² and, further, a test of the influence,
made by calculating separate r's for each of eleven testing groups, gave
results paralleled by the C-measure correlations (given later in this
article), viz., that the x-type error remained when the influence was
taken into account.

If "speed preference" enters the various subtests it does so in no
obvious way. On introspective grounds, from the consideration of













~— 340 The Journal of Educational Psychology 


errors made in the test-units, and from the general experience with
subtests, we must submit that subtests 2, 4, 5, 6, and 8, afford a battery
giving good power-speed measure, and that the tetrads involving these
subtests are free from gross "speed preference" so far as these subtests
are concerned. Of the non-verbal subtests, I, III, and IV, are likewise
good power-speed tests, calling for the best that a child can give
under conditions free from hurried, slighted, work. The x-tetrads
for these subtests show residual error that cannot be differentiated from
that for any of the other subtests. If a misbalance of the quantity-
quality function is critical in our subtests, then it would have been
found in previous work and, certainly, the whole question of the
function would require re-experimentation from the foundations
upward.

We are left, now, to consider the effect of "school and class" which,
we say, may resemble the effects that might be attributed to group
testing anomalies.

The Effect of C-measure.—The C-measure' is a score given to each
girl, the same for each girl in a particular class, on account of school
"standing" and class position. It is objective to the extent that its
foundation is the order Standards IV to VIIb. It should serve, to
some extent, as a measure of scholastic influences, such as reading
ability.

The correlations of C-measure with verbal subtests have been
given previously.’ The following new correlations are required:
$r_{c.I}$, $r_{c.III}$, $r_{c.V}$, $r_{c.VIII}$, having values 0.3912, 0.3075, 0.3647, 0.2858,
respectively. With the correlations now available we can calculate
partial correlations for Table I, for C-measure partialled out. The
resulting partial intercorrelations show that the non-verbal subtests
have average intercorrelation 0.2973; the "cross" correlations have
average 0.2700; and the verbal subtests have average intercorrelation
0.4503. It is obvious that the tetrads for such a table will agree, type
for type, with those for Table I. The results for the z1, z2, and y
tetrads, for the n-tetrads (non-verbal subtests among themselves) and
v-tetrads (verbal subtests among themselves), are similar to those
given above for Table I, except that slightly less residual error is
observed throughout. The 252 x-type tetrads have value as follows:


Mean, 252 x-type tetrads, Table I, with C-measure partialled
Observed pe (mean × 0.845) ............................. 0.0541
Theoretical PE approximately ........................... 0.0105

Omitting tetrads involving $r_{V6}$ (analogy subtests) leaves 234
tetrad-differences, with mean 0.0660. Omitting those for $r_{78}$ and $r_{56}$
in addition leaves 213 tetrads, mean 0.0644.
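
Partialling out the C-measure can be sketched with the standard first-order partial correlation formula. The paper does not print the formula here, so its use is an assumption, and the function name is illustrative. The worked value takes the correlation of subtests I and III as 0.4124 (the figure printed in Table III of this copy) together with the C-measure correlations 0.3912 and 0.3075 quoted above.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))


# Subtests I and III with C-measure partialled out (values as printed).
r_one_three_given_c = partial_correlation(0.4124, 0.3912, 0.3075)
print(f"{r_one_three_given_c:.4f}")
```

Applying the same function to every pair of subtests would give a partialled table from which tetrad-differences can be recomputed type by type.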

If we now allow, for these C-measure partialled correlations of
Table I, the 0.015 excess error which might be attributed to a sum of
errors due to "speed," "propinquity" (especially for $r_{78}$), and
"idiosyncrasies," age, calculation mistakes, and the like, we have still error of
about 0.0512 for the x-type tetrads, in excess of sampling error.
The grossest of these disturbances (due to "speed preference" in
subtests 3, 6, and 7; to "propinquity" for $r_{78}$; and to "idiosyncrasies,"
for 2 and 5, relative to the other verbal subtests) should be absent
from x-type tetrads for the four verbal subtests 4, 5, 6, and 8 (if $r_{56}$
is duly accounted for). The x-type tetrads for these four subtests
with the four non-verbal subtests, for C-measure partialled out,
omitting the tetrads involving $r_{56}$, are 54 in number, with mean 0.0511,
roughly the same as that above for the C-measure partialled table
as a whole, allowing 0.015 excess error.

Thus, with what may be described as the refinements entailed
by the C-measure partials, and allowing for excess error 0.015 as the
sum of that due to various specified disturbances, the x-type tetrads
nevertheless show error 0.050 in excess of that attributable to sampling
and the specified influences.


A SECOND INTERCORRELATION TABLE


We have used in Table I only part of the correlational data
available, and it should be determined whether other intercorrelation tables
give results in agreement with those already found in Table I. In
Table III we have included six non-verbal subtests and four verbal
subtests. The "cross" correlations for the subtests II and IV alone
are new.


There are 630 tetrad-differences for Table III, with the following
value:


Mean of 630 tetrad-differences ......................... 0.0294
Observed pe (mean × 0.845) ............................. 0.0248
Theoretical PE approximately ........................... 0.0100


There is observed, once more, appreciable error in excess of that
attributable to sampling. We now consider the tetrads in the z1,
z2, y, and x types.

TABLE III.—PRODUCT-MOMENT CORRELATIONS; N = 1037 GIRLS

     |  3   |  4   |  6   |  8   |  I   |  II  | III  |  IV  |  V   | VIII
3    |      | 4623 | 5266 | 5106 | 4217 | 3001 | 3279 | 3807 | 4145 | 2666
4    |      |      | 4593 | 5046 | 3273 | 3058 | 3683 | 2929 | 3908 | 2532
6    |      |      |      | 5220 | 3699 | 3281 | 3821 | 3715 | 5029 | 3109
8    |      |      |      |      | 3470 | 3192 | 4056 | 3753 | 4797 | 3014
I    |      |      |      |      |      | 3463 | 4124 | 3274 | 3733 | 3207
II   |      |      |      |      |      |      | 3484 | 2974 | 3424 | 3797
III  |      |      |      |      |      |      |      | 4140 | 4711 | 3508

There are 240 tetrads of type z1 for Table III, with mean of amount
0.0222; allowing for $r_{V6}$ and r,;,y;, (the former because both V and 6
are Analogy subtests, the latter because of the specificality shown in
the first paper⁴) leaves 190 tetrads, with observed pe (conventional,
i.e., mean × 0.8453) of amount 0.0136. Excess error of amount
0.0095 is therefore entailed.

There are 72 tetrads of type z2, with mean 0.0182. Eliminating
$r_{V6}$ leaves 66 tetrads, with observed pe of value 0.0142, i.e., an excess
error of 0.0100.

Some of the y-type tetrads for Table III have been included already 
in the results for Table I, but the 54 new tetrads of this type have 
mean 0.0144, or an observed pe of amount 0.0122, for the theoretical 
PE of amount 0.0100. The excess error is now only 0.007. 

Thus, excess error of about 0.0100 is found for the z1, z2, and y
tetrads. It is obvious that age correlations will reduce this excess,
and a sum of slight errors attributable to "school and class,"
calculational mistakes, and the like, might possibly explain the excess error
shown by these tetrads. So far the data confirms that found for
Table I.

There are 108 new x-type tetrads for Table III, with mean 0.0520.
Eliminating tetrads involving $r_{V6}$ and 7ry,y; leaves 90 tetrads, with
mean 0.0480. These x-type tetrads thus show excess observed error
of amount 0.0340. It is found that approximately this amount of
error is associated with each $r_{vv}$ for Table III. The w-type tetrads for
Table III, for $r_{34}$, $r_{36}$, $r_{38}$, $r_{46}$, $r_{48}$, $r_{68}$, respectively, have mean 0.0505,
0.0554, 0.0516, 0.0448, 0.0615, and 0.0484. Thus no single $r_{vv}$ is
associated with small x-type tetrad-differences. The results are
again similar to those obtained for Table I. The most significant
tetrads are those of x-type. But when subtests II and IV are included
in the table of correlations, as in Table III, the various errors concerned
are less than those met with for Table I, a fact that should receive
consideration for a few moments.

The subtests II and IV correlate with C-measure more highly
than the other non-verbal subtests, and are like the verbal subtests
in that respect. This may be taken to account for some of the decrease
in excess error shown by the 90 x-type tetrads, when compared with the
213 x-type tetrads for Table I. It is a matter of some interest to pay
regard to subtests III and V, comparing them with subtests II and IV.
On the one hand, it may be suggested, the subtests II and IV are not
so "novel" as subtests III and V (or as I and VIII as well); they
are not so likely to be so dependent upon fore-practice, not-clear-
understanding of the test requirements, conative inhibitions, and the
like. It might be considered that effects of this kind have entered
significantly into the x-type tetrads for Table I, so explaining the
difference in amount of excess error observed for these tetrads in
Tables I and III. On the other hand, subtests II and IV can not be
said to be of much intrinsic worth as instruments in which eduction
is critical. We would place more value on subtests III and V, which
are perhaps the best of the non-verbal in point of eduction mechanism.
We took some pains, in our first paper,⁴ to show that $r_{III,V}$ could not
be taken to entail specificality relative to the rest of the non-verbal
subtests; and the excess observed error for the x-type tetrads for
$r_{III,V}$ is approximately 0.090. Thus, against a possible influence such
as "novelty" (including therein influences like not-clear-understanding
of the test directions or requirements, etc.) we can but place the
knowledge that subtests III and V involve eductive mechanism to an extent
that can not be vouchsafed for subtests such as II and IV. We are
not disposed to accept the view that "novelty" and the like can be
taken to account for the greater excess error observed in the x-type
tetrads for Table I, chiefly because influences that might be anticipated
to be as effectual as "novelty" (such as, say, age effect) do not show
excess error so large as this under consideration, namely an amount
0.036, given by $\sqrt{0.050^2 - 0.034^2}$, where 0.050 is the observed excess
error in the x-type tetrads for Table I, and 0.034 is that for Table
III. However, there is room for further work in which the matter of
"novelty," fore-practice, etc., can receive critical attention. The
matter, indeed, is to receive attention in some work that we have
begun with a 500 population.
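
The difference between the two excess errors is again taken in quadrature. Written out, on the assumption that the two rounded figures 0.050 and 0.034 are the inputs:

```latex
\[
  \sqrt{0.050^2 - 0.034^2} = \sqrt{0.002500 - 0.001156}
  = \sqrt{0.001344} \approx 0.0367 ,
\]
```

which is close to the 0.036 quoted; the small difference presumably reflects rounding of the two inputs.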

Finally, to continue the consideration of Table III, we must
conclude that the x-type tetrads show excess error that can not receive
adequate explanation in terms of the many influences brought forward
by us in the course of this and previous papers. The data for Table
III is similar to that obtained for Table I.

Thus, after taking account of many sources of error known to us,
including scale anomalies, the effects of school and class, calculation
mistakes, age effects, "speed preference," group testing anomalies,
etc., we are left with a not inconsiderable observed excess error in
x-type tetrads. For all the x-type tetrads that can be constructed for
our verbal and non-verbal subtests this excess error is not less than
0.050 in amount, after making allowance for sampling and for an
amount 0.015 attributable to influences such as age, etc.


THE PROBLEM OF A VERBAL GROUP FACTOR


A group factor may be shown for the verbal subtest
intercorrelations when G, (the common factor observed in the non-verbal
subtests⁴) is partialled out. Using the non-verbal subtests as "reference
values," in terms of which the g-saturations of the verbal subtests
may be determined,’ we calculate the specific correlations* for the verbal
subtests among themselves. The results are given in Table IV. The
Table is for G, partialled out. We have to decide whether these specific
correlations fit the tetrad criterion, so that a group factor may be
taken to run through them.


TABLE IV.—PARTIAL CORRELATIONS, FOR G, PARTIALLED OUT; DERIVED FROM
TABLE I

     | 2g  | 3g  | 4g  | 5g  | 6g  | 7g  | 8g
2g   |     | 418 | 356 | 404 | 316 | 335 | 356
3g   |     |     | 215 | 396 | 252 | 296 | 236
4g   |     |     |     | 204 | 180 | 233 | 257
5g   |     |     |     |     | 311 | 333 | 252
6g   |     |     |     |     |     | 268 | 211
7g   |     |     |     |     |     |     | 318
8g   |     |     |     |     |     |     |

There are 105 tetrad-differences for Table IV, with the following
value:


Mean of 105 tetrad-differences ......................... 0.0195
Observed pe (mean × 0.845) ............................. 0.0165
Theoretical PE approximately ........................... 0.0100


If we eliminate tetrads which involve r3; and 755 (which, we have reason
to suspect, involve specificality due to similarity of relations), the
above mean is reduced to 0.0166; i.e., there is observed pe of amount
0.0139, for a theoretical PE 0.0100. It is obvious that by partialling
out age correlation and C-measure this observed pe will become nearly
exactly the same as the theoretical value. We conclude that the Two
Factor Theory fits the specific correlations. The new factor, which
appears to be common to our verbal subtests, may be named V.
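
The 105 tetrad-differences for Table IV can be formed mechanically from its 21 correlations. The sketch below is offered as an illustration, not as the author's computing procedure: it assumes the entries are three-place decimals with the point omitted, that each set of four subtests contributes three differences, and that the mean is taken without regard to sign. The values are transcribed from Table IV as printed in this copy, so any misprint there carries through; for that reason no expected mean is stated.

```python
from itertools import combinations

SUBTESTS = ["2", "3", "4", "5", "6", "7", "8"]

# Upper triangle of Table IV, row by row, decimal points omitted.
UPPER_TRIANGLE = [
    [418, 356, 404, 316, 335, 356],
    [215, 396, 252, 296, 236],
    [204, 180, 233, 257],
    [311, 333, 252],
    [268, 211],
    [318],
]

# Symmetric lookup r[(i, j)] for every ordered pair of distinct subtests.
r = {}
for i, row in enumerate(UPPER_TRIANGLE):
    for offset, value in enumerate(row):
        j = i + 1 + offset
        r[(i, j)] = r[(j, i)] = value / 1000.0


def tetrad_differences(corr, n_tests):
    """Three tetrad-differences for every set of four tests."""
    differences = []
    for a, b, c, d in combinations(range(n_tests), 4):
        p1 = corr[(a, b)] * corr[(c, d)]
        p2 = corr[(a, c)] * corr[(b, d)]
        p3 = corr[(a, d)] * corr[(b, c)]
        differences.extend([p1 - p2, p1 - p3, p2 - p3])
    return differences


differences = tetrad_differences(r, len(SUBTESTS))
mean_abs = sum(abs(x) for x in differences) / len(differences)
print(len(differences), round(mean_abs, 4), round(mean_abs * 0.845, 4))
```

The count printed first should be 105; the second and third figures are the mean and the conventional observed pe for the transcribed values, to be compared with the 0.0195 and 0.0165 given above.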

We have arrived at the conclusion, then, that the verbal subtests
involve two common factors, G,, which we would take to be the
universal g-factor, and V. This may explain the apparently intractable
residuum of observed error in the tetrads for the verbal subtest
intercorrelations (comparing the results found in our first paper⁴ with
those for the second paper,⁵ for the non-verbal and verbal subtests
respectively), and in the z2 tetrads for, as was suggested earlier in this
paper,* the excess error there seemed to be on the border-line of
acceptability in terms of extraneous influences of the kind "school and class,"
age, etc. Now the tetrad effect of two common factors becomes
scarcely distinguishable from that of only one common factor when
the correlations are of nearly equal magnitude, and it will be noted
that the verbal subtest intercorrelations are of such a nearly equal
magnitude. Thus, if we accept the G, factor found for the non-
verbal subtests, and the V factor found in addition for the verbal
subtest intercorrelations, then the results obtained for the tetrads
of the verbal subtest intercorrelations fit well with the theoretical
expectations.
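
The remark about two common factors can be made concrete with a short derivation. It assumes, purely for illustration, that each verbal subtest a has a loading g_a on the general factor and a loading v_a on V, that the two factors are uncorrelated, and that specific factors do not correlate; these symbols are not the author's.

```latex
% Assumed two-factor structure for verbal subtests a, b, c, d:
\[
  r_{ab} = g_a g_b + v_a v_b .
\]
% The tetrad-difference then factorises:
\[
  r_{ab}\, r_{cd} - r_{ac}\, r_{bd}
  = (g_a v_d - g_d v_a)\,(g_b v_c - g_c v_b) .
\]
% It vanishes whenever the ratio v/g is the same for the subtests
% concerned, for example when all loadings are equal:
\[
  \frac{v_a}{g_a} = \frac{v_d}{g_d}
  \quad\text{or}\quad
  \frac{v_b}{g_b} = \frac{v_c}{g_c}
  \;\Longrightarrow\;
  r_{ab}\, r_{cd} - r_{ac}\, r_{bd} = 0 .
\]
```

On this reading, nearly equal intercorrelations suggest nearly equal loadings, so the tetrad-differences among the verbal subtests would stay small even with V present.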

We have now to turn to consider explanations of the V common
factor, of the observed excess error in the x-type tetrads.

If the various effects and conclusions considered in the course of 
our papers are acceptable, we have inductively narrowed the field 
of possible explanations for the V-factor. We are left, indeed, with a 





* Under Tetrads of Type Y. 




possible explanation in terms of "verbality." It is true, however,
that doubts remain, especially concerning the influence of "speed,"
"novelty" and fore-practice, and conative differences in the verbal and
non-verbal subtests.* However, our data are not put forward as
a finished product, but as primarily an exercise in correlations, tetrads,
and errors other than sampling. The verbal and non-verbal subtests
used in our work, while being similar to those generally used in
"intelligence" tests, are not completely suited to work that would make
contact with matters of scientific psychology. We can now employ
tests which entail eductive principles more thoroughly, we can
construct better non-verbal subtests on primarily perceptual lines. We
have under way a re-experimentation with a 500 population, which, it
is hoped, will go far to place us correctly on the path of facts of some
psychological value. It is with knowledge of the limitations, and the
need for further work, that we turn to consider our V-factor in terms
of "verbality."

We should ask, in the first place, how wide the V-factor might 
be expected to extend. Is it likely that it will extend through all 
verbal subtests whatsoever, so constituting a verbal general factor? 
Or is it likely to be confined to the collection of verbal subtests used in 
our experiment, so constituting a limited factor? 

We have observed disturbances in x-type tetrads previously,’
for other verbal and non-verbal subtests. But it is more important
to recall that work by Davey,¹ although obtaining results that
correspond apparently closely with ours, yet found that results for two
verbal subtests did not correspond with the more usual result,
indicating instead the absence of any verbal general factor or factors. Thus,
Davey took the view that the specificality shown by verbal subtests
must be of limited range. Further, the specificality observed for some
of her verbal subtests was attributed to "similarity of relations."

Consideration of Davey's Data and Conclusion.—Davey's work was
with verbal subtests, orally applied, and pictorial subtests. Now
pictorial subtests are at best only secondarily perceptual. Words and
sentences are used in the test directions of all our subtests, but once the
purpose of the subtest is "set," a difference between our verbal and
non-verbal subtests is that words are not ostensibly used, even
implicitly, in the non-verbal subtests (excepting, say, subtest IV). This
condition, however, might not have held in Davey's pictorial subtests.
Again, some of Davey's pictorial subtests had words or sentences
directly involved (pictures had to be selected to match short
paragraphs with words omitted; names of items in pictures had to be
written down, etc., these verbal parts being orally applied). Thus,
there is a possibility that word-entailment, with possible
"reproduction" individual differences,’ is not subjected to sufficiently critical
experiment in the pictorial subtests. Some of our non-verbal
subtests are more primarily perceptual than pictorial subtests can be.


* One possible influence, so far not mentioned, is that of an effect due to the
subtests being applied a day apart. Previous work, however, shows that such an
influence must be slight, at most 0.005 error resulting in x-type tetrads.


We have some evidence that the oral subtests used by Davey entail
specificality relative to other verbal subtests;* and some of the
specificality observed by Davey might be attributable to an influence of the
oral presentation. The oral Analogies, Opposites, and Classification
subtests entail specificality which Davey attributed to "similarity of
relations," and our work offers some support for such a specificality in
these subtests. But the V-factor in our work is observed over and
above this latter specificality, and likewise over and above any factor
attributable to presentation of the test material. It seems that the
V-factor is over and above any specificality found by Davey, a matter
that would follow upon acceptance of the arguments put forward in the
previous, and following, paragraph.

Again, the oral Inferences and Likelihood subtests, which alone of
Davey's subtests did not show excess error in the x-type tetrads, and
upon which the conclusion concerning a limited factor depends,
correlated only 0.35 (approximately). Under the conditions most verbal
subtests correlate more highly. The test-units in these two subtests
are somewhat lengthy and complex in structure, especially perhaps for
eight to ten year olds of Davey's eight to fourteen year group. It is
possible that not-clear-understanding of the test requirements, or
excessive memorization effect,’ or lack of ability to keep the whole
test-unit before the "mind's eye," have entered critically into these
two subtests, resulting in a smaller correlation than might be expected
in the circumstances.

Thus, the facts are perhaps not too strongly in support of the con- 
clusion concerning a limited verbal factor, and nothing hitherto found 
in this field can be taken to contradict the facts found in our work. 

After eliminating disturbances due to various possible influences,
our work indicates with some certainty that specificality runs through
our verbal subtests, and amounts to a single factor that we have
named V. The specificality would appear to be the more discernible
the more the non-verbal subtests tend towards being primarily
perceptual. Further experimentation is required before we can decide
that the V-factor is very extensive, but our work may be taken to
tend to indicate that the V-factor would possibly extend through most
verbal subtests in current tests of "intelligence." If our consideration
of Davey's conclusion has any factual basis, then we may conclude that
no work extant contradicts the possibility of there being V-specificality,
approximately a single V-factor, of very wide range.

The possible contact of this V-specificality with "reproduction"
has been touched upon in the previous paper;⁵ and we would suggest
that the facts of the general law of retentivity of dispositions,’
introspection, knowledge that we have of individual differences in the
categories specified,® and the facts of aphasia, all tend to support a view
that V-specificality might be expected as the consequence of individual
differences that we have covered by the term "reproduction," i.e.,
reproduction of words, phrases, sentences, or ideas. The theory, in
any case, seems worthy of further experimental attention. It is not
improbable that "reproduction" constitutes an influence in the "speed
preference" effect in verbal subtests.

For the present, then, we would say that it is possible that when 
reproductive influences are allowed scope in subtests, then V-specif- 
icality would tend to show. Similarly, no doubt, vocabulary can 
augment the V-specificality; although our data give no indication that 
the V-specificality is conditioned by an antithesis that was employed in 
the verbal subtests. (Subtests 3, 6, 7 and 8 were constructed of words 
with simple concrete meaning, while 2, 4, and 5 contained words of 
more abstract meaning.) 

In point of fact we obtain V-factor extending through all our verbal 
subtests, tending to indicate a general V-factor in addition to the uni- 
versal g-factor in verbal subtests. We have mentioned previously, 
however, that the conclusion is tentative. There is room for improve- 
ment in the matter of the facts, and we await with interest the results of 
our second experiment with other verbal and non-verbal subtests. 


CONCLUSIONS 


We have covered a mass of correlational data in the course of two
papers⁴ ⁵ and the present one, dealing with subtests applied to a 1037
population of girls. No doubt brevity has left marks of ambiguity or




Tetrad-differences for Verbal Subtests 349 


of unintelligibility, for we have had to give conclusions frequently 
without much reference to the relevant data. There are likely to be 
found errors of calculation or transcription. But any reworking of the 
correlational data would, we submit, give us no cause to depart from 
the following general conclusions: 


1. The non-verbal and verbal subtests have a high correlation, amounting to 
0.82 for a summed correlation for many subtests of both kinds. The fact stands 
in opposition to the opinions that have sometimes depicted the two abilities as 
independent. 

2. In the case of the non-verbal subtests, the tetrad-differences matched with 
some exactness the value to be expected from sampling error. This held good 
regardless of the size of the sample, for sub-populations of 100, 200, 500, up to 
the full 1037. 

3. In the case of the verbal subtests the tetrad-differences were appreciably
larger than those to be expected from sampling alone. After making allowances
for evident disturbances there remained an excess error of approximately
0.015, a value scarcely to be accepted as merely due to disturbances of the kind
considered in explanation of it.

4. But much higher residual error came from the tetrads involving both verbal
and non-verbal subtest correlations. This was of magnitude about 0.050, thus
indicating with certainty some factor or factors in one or both of the two kinds of
subtests.

5. On closer examination the evidence was against any group factor in the 
non-verbal subtests, but was in favor of one group factor extending rather evenly 
throughout the verbal subtests. Such a factor, moreover, would explain the 
roughness of the fit of the tetrad formula to the verbal subtest intercorrelations, as 
mentioned in (3) above.

6. On the whole, the indications are that this V-factor extends through all 
verbal abilities, and therefore may be called a general factor V (as contrasted with 
the universal general factor g which is found in both verbal and non-verbal subtests 
alike). On this matter, however, further facts are required; these are to
receive consideration in a subsequent experiment.
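
The reasoning behind conclusions 2 to 5 may be made concrete by a small numerical sketch. It is offered only as an illustration: the loadings below are hypothetical and are not taken from our data, and the function names are ours. When four subtests share a single factor g, each correlation is the product of two g-loadings and all three tetrad-differences vanish; when two of the subtests also share a group factor (here called v), the tetrads that pair those two subtests together depart from zero.

```python
def correlations(g, v):
    """Correlation matrix implied by a general factor g plus one group factor v.

    g and v are lists of loadings (hypothetical values, for illustration only).
    Off-diagonal entries are r_ij = g_i*g_j + v_i*v_j.
    """
    n = len(g)
    return [[1.0 if i == j else g[i] * g[j] + v[i] * v[j] for j in range(n)]
            for i in range(n)]


def tetrads(r, a, b, c, d):
    """The three tetrad-differences for subtests a, b, c, d."""
    t1 = r[a][b] * r[c][d] - r[a][c] * r[b][d]
    t2 = r[a][b] * r[c][d] - r[a][d] * r[b][c]
    t3 = r[a][c] * r[b][d] - r[a][d] * r[b][c]
    return t1, t2, t3


# Hypothetical g-loadings for four subtests.
g = [0.8, 0.7, 0.6, 0.5]

# Case 1: g alone. Every tetrad-difference is zero apart from rounding.
only_g = correlations(g, [0.0, 0.0, 0.0, 0.0])

# Case 2: the first two subtests also share a group factor v.
with_v = correlations(g, [0.4, 0.4, 0.0, 0.0])

print(tetrads(only_g, 0, 1, 2, 3))
print(tetrads(with_v, 0, 1, 2, 3))
```

In actual data the tetrad-differences are never exactly zero, so the question is always whether they exceed what sampling error alone would produce, as in conclusions 2 and 3.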


In conclusion, we would offer, once more, our very sincere thanks 
to Professor Spearman for willing guidance, for many aids to a clearer 
view of facts considered, and for many kindnesses given throughout 
our work in his Laboratory. It has been a true pleasure to have been 
a student and Research Assistant under Professor Spearman. 
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ORGANISMIC PSYCHOLOGY AND EDUCATIONAL 
THEORY¹


KENNETH SELTSAM 
University of Minnesota 


There has, within the last five years, come into prominence a
so-called new school of psychology known in its original setting as
“Gestalt-theorie,” as “Configurational” in the translation of
Titchener,² and still more recently, in the language of Wheeler, as
“Organismic.”³ Very few people are aware of the fact that in point of origin,
at least, it is almost contemporaneous with “behaviorism,” and that, if
anything, it is the older of the two. Whatever may be the opposed
contention of systematic authorities, the members of the school
consider it to have begun with a study of “apparent-movement”
made by Wertheimer in a German laboratory some time in 1912.
Its longer survival than that of behaviorism, it is said by some, may
be attributed to its relatively slow growth. However that
may be, the organismic thought is demanding serious consideration
from psychologists in general. It is forcing the perhaps too long
unquestioned orthodox psychology to make a more careful invoice of
stock. And in that probing process, if we may so name it, educational
theory has not been, and cannot be, immune.


THE MAIN CONTENTIONS OF ORGANISMIC PSYCHOLOGY


Before considering organismic, or gestalt, psychology as it is related
to educational theory, it would perhaps be advantageous to inquire
into the school’s main contentions. Roughly they may be classed
under two heads: first, those relative to the circumstances involved in
any event; and secondly, those pertaining primarily to the individual
experiencer. To the first of these would be given the name
situation-as-a-whole. A response, the organismic psychologists say, is never





1 The Editor announces with regret the death of Kenneth Seltsam on
November 30, 1930. This paper was prepared by Mr. Seltsam while at the University
of Kansas.

2 Helson, H.: American Journal of Psychology, Vol. XXXVI, pp. 343, 495; Vol.
XXXVII, pp. 25, 189.

3 Wheeler, R. H.: “The Science of Psychology.”

4 Boring, E. G.: “History of Experimental Psychology.”
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made to an isolated stimulus. The S — R description of events which 
is presupposed by the theories of association, attention, and behavior- 
ism is obviously not true to the facts of the case. A response (to 
state the law of configuration) is made to a situation-as-a-whole, 
and if to any particular detail, always to that detail in its relation 
to the other details. The organismic position maintains also that the 
psychological situation is not different from the chemical or the 
physical in its general aspects, and is, therefore, governed by similar 
natural laws. 

One of the natural laws which is seen to have considerable psy- 
chological significance is the Law of Least Energy, whose statement 
for application to the field of psychology is not essentially different 
from that for other sciences. If movement of energy is always in 
the direction of a low potential, one immediately comes to realize 
that such concepts as “Trial and Error” are at best absurd. It is,
according to this manner of facing the problem, absolutely impossible
for an organism to act without accomplishing something with regard to
the point of low stress, which organismically is defined in any learning
situation as the “goal.” In general, the situation-as-a-whole is seen
to be an enormously complex thing. As such, it cannot be represented
by the tremendously “un-complex” symbol S, as that sign has
historically been used.

As one might expect, the organismic psychologists have still more 
to say of the organism-as-a-whole. They rebel violently against the 
atomistic physiology of the conditioned reflex school and others. 
They maintain that structural analysis can never act as an end in 
itself; that to assume that psychology is to reach the ranks of a science 
through the analyses of great introspectionists, as the Titchenerian 
formula reads, is an impossible assumption. The organism-as-a- 
whole is to be approached as an integrated unit through the medium 
of functional analysis. Experimentation is to be conducted by 
alteration of the conditions; and when this alteration takes the form 
of extreme limitation (as in the case of neural study) it is to be recog- 
nized that the results are to a certain extent, at least, abstractions 
from the real phenomena. Just as it is impossible to think of a situa- 
tion-as-a-whole in terms of S, so it is, in the light of the organismic 
position, impossible to label the reaction of the organism-as-a-whole 
as R, simple response. 

In general, the major contentions of the new school may be summa- 
rized very crudely in what would seem for them a definition of the 
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science. Psychology is a science of ‘‘wholes”’ which deals with the 
responses of organisms-as-wholes to situations-as-wholes. Anything
short of these is, on the one hand, essentially physiology, and on the
other, physics.


THE CRITICISMS OF GESTALT, OR ORGANISMIC, PSYCHOLOGY 


The structuralistic psychologists, in the meantime, have not seen
fit to accept these violent criticisms as such. One should not expect
them to do so. One of the typical rejoinders, and one perhaps less
logical than any other, is that the new school is, after all, not new.
One finds such statements as the following from Boring: “Like James,
Dewey was a Gestalt-psychologe twenty years too soon.”¹ Squires,
in a recent comment, demands that above all else in our dealings with
the new school of thought, we maintain a “balanced historical sense.”²
In a semi-popular discussion, involving about as much over-statement
as might be expected of such, Robinson would have us believe that
not only is the principle of configuration not new, but it is of questionable
actual significance.³

The problem of the age of any line of thought is strikingly
paradoxical. Age is at once both desirable and undesirable. Were
it not possible to trace organismic thought back through the history
of psychology (through Dewey, McDougall, James, the Mills, Stout,
and even back to Empedocles in the fifth century B.C.), orthodox
psychology might have greater ground still for criticism. That
it has been a development, an evolutionary outcome, rather than 
a scientific upstart should, it would seem, be a favorable indication. 
It is, also, quite possible to reconcile the fact of age with the idea 
of a new contribution to make. To assume otherwise is to admit that, 
since “there is nothing new under the sun,” it is futile to attempt
originality, and that neither sociological nor more narrowly scientific 
movements have any essential evolutionary value. 





1 Boring, E. G.: History of Experimental Psychology. American Journal 
of Psychology, Vol. XLII, Apr., 1930, p. 540. 

2 Squires, P. C.: Gestalt Psychology and the Gestalt Movement, A Criticism
of the Configurationist’s Interpretation of “Structuralism.” American Journal
of Psychology, Vol. XLII, Jan., 1930, p. 134.

3 Robinson, E. S.: A Little German Band. New Republic, November 27,
1929, p. 782.
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Since a great deal of the organismic experimentation has been 
conducted with animals,¹ in a manner somewhat strange to traditional
method, it is said by some that the gestalt approach is a return to
anthropomorphism. What could be, however, in the genuine sense of
the word more so than the typical experiment out of which has grown
the concept “trial and error” is indeed difficult to imagine. Any
experimentation which sets as its criterion of success a performance
demanding of an animal at least average human intelligence is
ultra-anthropomorphic. If, however, we are to hold that an experiment
which gives the animal an “even chance” is thus to be described, it is
possible that after all being anthropomorphic is not the “crime” it
once was.

Again, we are told that gestalt, or organismic, psychology is purely
theoretical: the mind-child, so to speak, of over-zealous Teutons.
Since nothing has been proved (so the criticism runs) very little
concern need be had. This again seems hardly fair. Were we to assume
such an absolute standard in our dealings with scientific facts and 
methods, progress would be permanently halted. Experimentation 
of any kind presupposes a certain number of working hypotheses. 
However one may use logic, it is difficult to imagine an experimental 
situation not demanding a certain viewpoint of approach which, of 
necessity, must remain in the theoretical realm. To be sure, any 
working hypothesis deserves to stand or fall in terms of experimental 
verification. Until that time, however, to say that certain assumptions
are valueless simply because they have not been tested is at best a 
misconception. 

Still again the gestalt contributions have been said to exist
essentially in an alteration of concepts. Instead of “trial and error,”
“apperception” or “libido” we have “insight.” “Maturation”
has taken the place of the “stamping-in process.” And so forth. It is
true that the organismic psychologists might give the name
“conditioning” or “learning by repetition” to certain aspects of the learning
situation they would describe. So, too, one might call one’s
office-chair “water.” In the sense that there are certain fundamental





1 Kohler, W.: “Mentality of Apes”; Wheeler and Perkins: Configurational
Learning in the Goldfish. Comparative Psychology Monograph, Vol. VII, No. 1,
March, 1930; Helson, H.: Insight in the White Rat. Journal of Experimental
Psychology, Vol. X, 1927, pp. 378-396; Lewis, M. H.: Elemental versus
Configurational Response in the Chick. Journal of Experimental Psychology, Vol. XIII,
No. 1.
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finite constituents such a statement would be correct. It would, 
nevertheless, not be true historically. And, after all, a concept has 
value only insofar as there is not contradiction in its historical usage. 
Essentially, what lies behind such criticisms as those of Kuo, who would,
so to speak, “junk” all concepts, is a dissatisfaction with the tendency
to so jumble their meanings as to make them practically useless.¹
A concept, strictly speaking, must suggest subtle differences, and to do
so it cannot be so overloaded as to mean everything. As such, it is,
prematurely, destined to mean nothing.

There have been still other criticisms.² Generally speaking,
each involves a certain validity. It would be foolish to contend that
gestalt or organismic psychology is in any sense a perfect creation,
against which all criticism is misplaced. As a school it is both old and
new, theoretical, and perhaps over-ambitious. So long, however, as it
has an amplification and a freshness of approach, no true scientist can
well overlook its attempts.


ORGANISMIC PSYCHOLOGY AS A CONTRIBUTION TO EDUCATIONAL
PHILOSOPHY 


The discussion so far has dealt rather summarily with the
controversial aspects of “Gestalt-theorie” as a school of general psychology.
The attempt shall now be made to analyze what seem to be its contributions
to educational theory. Such an analysis must, for the most part,
be based upon opinion and logic, since educational experimentation in
the field is almost non-existent.





1 Kuo, Z. Y.: Psychological Review, Vol. XXIX, p. 344.

2 Calkins, M. W.: Critical Comments on the Gestalt-theorie. Psychological
Review, Vol. XXIII, 1926, pp. 135-158.

Hsiao, H. H.: A Suggestive Review of Gestalt Psychology. Psychological
Review, Vol. XXXV, 1928, pp. 136-141.

Pillsbury, W. B.: Gestalt versus Concept as a Principle of Explanation in
Psychology. Journal of Abnormal and Social Psychology, Vol. XXI, 1926, pp. 14-18.

Rignano, E.: The Psychological Theory of Form. Psychological Review,
Vol. VIII, 1925, pp. 118-135.

Lund, F. H.: The Phantasy of Gestalt. Journal of General Psychology, July,
1929.

Woodworth suggests that the S — R description be changed to read
S — org — R. The organismists of course maintain that this description is still extremely
atomistic. It is, nevertheless, something of a recognition of the
organism-as-a-whole.




















356 The Journal of Educational Psychology 


In education, if any place, it should be desirable to consider
stimulating circumstances as wholes. We may, perhaps, owe the
development of the behavioristic concept “stimulus,” in the extreme isolated
sense, to such formulation as Morgan’s law of parsimony in science.
Forever, we have been attempting to “strip” experience to the limit.
The lay tendency in this direction is well shown by a study of the
behavioristic reactions of cattle made some time ago by Stratton.¹
In this analysis, it becomes apparent that there is no correlation 
between the presence of a red flag and anger in cattle. The anger 
reaction is aroused only when the total situation is established—when 
the flag is waved in an annoying manner, when the waver jumps about, 
making peculiar vocal sounds and stirring into motion the dust about 
him. 

The language field again provides an interesting source for examples
of the inadequacy of the “simple stimulus” theory. If I present the
words “Blanche White Weiss” in a free association test to a group
of subjects including different ages and professions, the reactions are
certain to vary tremendously. To the naive, “Blanche White Weiss”
is not more than a rather peculiar name. To the person well trained
in languages, it seems as three repetitions of the same word, “White
White White.” Between these two extremes, and varying directly with
knowledge of French, German and English, will be intermediate
reactions. Yet, throughout, the physical stimulus is exactly the
same.

Until the teacher comes to appreciate the fact that the stimulus 
which he presents exists only in relation to the learning situation-as-a- 
whole, the reactions he observes will represent to greater or less degree, 
a dilemma. Only when he realizes that a word of sarcasm, a smile of 
contempt, or an unfair concession is bound in most cases to emerge 
from that whole and dominate it, will he be aware of the inadequacy 
of independent emphasis upon some bit of subject-matter. His 
ability to understand why certain “stimuli” fail to call forth certain
“responses” will vary directly with his comprehension of a
tremendously complex situation. He is not, after all, the first violinist in
a metropolitan orchestra. He is a whole orchestra in himself, and 
while one sound, that of his one arm violin, may dominate, it does so 
only as an emerged part of a situation-as-a-whole. 





1 Stratton, G. M.: The Color Red and the Anger of Cattle. Psychological
Review, Vol. XXX, p. 321.
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However prone we may be to express disapproval of what his- 
torically was ‘‘faculty” psychology, we have continued in a striking 
degree to do “faculty thinking.” We talk of “reflexes” and of
“automatic habits.” We continue to debate the questions as to
just what parts of our organizations are native and what acquired. 
In our attempt to be scientific, we have seen fit to study anatomy and 
physics as atomistic sciences, instead of the psychology of learning. 
A child has been educated in history and geometry, not in general 
understanding. His biology teacher has refused to aid in the correc- 
tion of his English. A premium has been placed upon his accumulation 
of an abundance of rather meaningless data, even by methods admit- 
tedly unethical. It seems we seldom stop to think that we are 
educating a whole child, who exists as an intact organism and only as 
such. 

We continue to define learning as “little more than the establishing
of bonds between responses and stimuli,”¹ apparently naive to the
discoveries of such experiments as those of Haller in the eighteenth
century,² and the masterly studies of Lashley in our own period.³
A stimulus has been considered to effect certain specific connections,
not the whole organism. With Gates, we have said, “When once a
desired reaction is made, it may be stamped in by further exercise
until well learned, so that later it occurs promptly and surely.”⁴
Learning, through experience as such, or drill, has been emphasized.
“At present the attitude of many teachers toward drill work is
undergoing a wholesome change. We are coming back to drill. The neglect
of necessary drill work has been one of the bases for just criticism of
the ‘soft pedagogy’ in the schools.”⁵ All the while we have been
speaking with confidence of something of which, as Lashley tells us, we
know very little.

At the same time, the human organism has been made to act 
in a fashion foreign to any other natural phenomenon. He has been 
said to learn through a meaningless repetition of muscular movements. 
Vaguely, the idea of goal-seeking, or action toward a point of low 
organismic tension, has been recognized for a long time. For Kil- 





1 Douglass, H. R.: “Modern Methods in High School Teaching.” Pp. 22.

2 Haller concluded that by the time a man died, he would, according to the
trace physiology, have to have 200,000 impressions for every gram of grey-matter.

3 Lashley, K.: “Brain Mechanisms and Intelligence.”

4 Gates, A. I.: “Psychology for Students of Education.” Pp. 212.

5 Stormzand, M. J.: “Progressive Methods of Teaching.” Pp. 227.
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patrick, it is ‘‘mind-set-to-an-end.’”’ He also suggests that before 
reaching that end the organism may, in the process of choosing certain 
means “be torn within.”¹ Almost nowhere, however, is a cue given
to an understanding of the organism as a part of a natural scheme, 
all of which, it is rather logical to suppose, is governed by comparable 
laws. 

We have stumbled time after time in our attempt to explain why
it is that John Jones, who has practically the same mental capacity
(as measured by certain criteria) as Sam Smith, does not see fit to
“apply” himself. We have said, “Sam likes to study.” Never,
apparently, have we appreciated thoroughly the fact that possibly
there is just as even a balance of tension in one case as in the other.
For Sam Smith, study provides the greatest resolution of a most
complex system of stresses, internal and otherwise. John Jones,
on the other hand, is organized differently. He has different goals, a
different configuration of stresses, and consequently is a different
whole. Both individuals function according to the same general
principle: organismically, the law of least energy. The thought here
is not new. In every language there are proverbs expressing in a 
crude way the idea involved. But as a point of scientific emphasis in 
psychology, it is relatively new and can, if correctly appreciated, lead 
to a great clarification in character and personality study. 

Finally, organismic psychology forces educational philosophy 
to reconsider whether the hedonic conception, which has abounded 
since Socrates and no doubt even before, is after all, the true
conception. Is “happiness” a goal of man’s activity, as such? Does
“practice with satisfaction” explain selection for continued usage?
Do the very possibilities of attaining happiness fade immediately 
as that most abstract of all abstract things is aimed toward? If 
education is to answer negatively with the gestalt or organismic 
psychologists to these questions, the whole view of education will be 
more or less altered. It will be broadened. The organism to be 
educated will be seen as an emerged part of a larger whole which in 
turn, stage by stage, may be said to include the whole universe. The 
organismic psychologist’s “human” is a larger being. He is not an
isolated phenomenon depending for his growth upon some philo- 
sophically abstract principle. He moves for the same reason that a 
current of air moves. For him, experience is hearing, seeing, and 





1 Kilpatrick, W. H.: “Foundations of Method.” Pp. 163.
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feeling all at once. He does not see without hearing, nor does he hear 
without seeing. He matures in the same way that any physical body 
changes. He is a being “attuned to the world,” and understandable
only as such. 


In conclusion, it would seem that we can believe with Boring that: 


The progress of thought is gradual, and the enunciation of a “new” crucial
principle in science is never more than an event that follows naturally upon its
antecedents and leads presently to unforeseen consequents.¹


and still agree with Squires: 


Every system has had something of value to contribute to our total fund of
knowledge.²


If gestalt or organismic psychology has helped to clarify our
picture of the human organism that we would educate, if it has
stimulated inquiry and application (and one has but to examine journals in
speech education,³ music,⁴ and so forth⁵ to realize the fact), certainly
we must recognize that the organismic school has a contribution to
make to the theory and practice of education which cannot well be
ignored.





1 Boring, E. G.: The Gestalt Psychology and the Gestalt Movement. American
Journal of Psychology, Vol. XLII, No. 2, April, 1930, p. 308.

2 Squires, P. C.: A Criticism of the Configurationist’s Interpretation of
“Structuralism.” American Journal of Psychology, Vol. XLII, No. 1, January, 1930.

3 Gray, G. W.: Gestalt, Behavior and Speech. Quarterly Journal of Speech,
Vol. XIV, Nov., 1928, pp. 530-534; Gestalt Again. Vol. XV, Feb., 1929, pp. 85-92.

Parrish, W. M.: Implication of Gestalt Psychology. Quarterly Journal of
Speech, Feb., 1929.

Woolbert, C. H.: Psychology from the Standpoint of the Speech Teacher.
Loc. cit., Feb., 1930.

4 Webber, R. A.: Good Practice Judgment. Violinist, Vol. XLIX, No. 6,
June, 1929, p. 199.

Kwalwasser, J.: Music Appreciation; Is It Vital?
Vol. XVI, No. 4, Mar., 1930.

5 Moore, R. and Van Waters, M.: The Child and the New Psychology.
Libraries, Vol. XXXV, Mar., 1930, pp. 117-120.
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A GROUP INTELLIGENCE TEST SUITABLE FOR 
YOUNGER DEAF CHILDREN 


R. PINTNER 


Teachers College, Columbia University 


Up to the present time no group intelligence test has been suitable 
for the deaf child in the beginning grades of deaf schools. All of the 
group intelligence tests for the beginning grades in hearing schools, 
such as the Detroit First Grade, the Otis Primary, the Pintner-Cun- 
ningham, etc., make use of verbal directions in giving the test, and 
hence are not suitable for the deaf. On the other hand, the Pintner
Non-language Mental Test, which has proved to be well adapted for
deaf children, is too difficult for deaf children below the age of nine or
ten. Hence the intelligence of young deaf children from about age
five to nine has not so far been measurable by means of group tests.
The new Pintner Primary Non-language Mental Test¹ would seem
to fill this gap.

This primary non-language test was constructed for the measure- 
ment of intelligence below the range covered by the Pintner Non- 
language Test, which is of little or no use below Grade II. Like the 
latter, it is non-verbal in content and makes no use of language in 
the directions. These are carried out by means of pantomime and by
examples of procedure performed on the blackboard. It was
constructed as a non-language test for kindergarten, Grade I and Grade
II, so as to counteract any language handicap that may be present
among young children coming from homes with varying degrees of
familiarity with the English language. A recent try-out of this test in
two deaf schools would seem to show that it has decided possibilities 
as a group test for the young deaf school child. 

The test itself consists of four sub-tests. In the first the subject 
has to note the object or picture held up by the examiner and then 
find this object among the pictures in the test booklet and mark it.
There are nine such items. Sub-test two calls for the completion of a 
geometrical form, the correct form being always in view as a guide to 
the subject. The third sub-test requires the completion of a face, 
each item having various parts of the face omitted. The last sub-test 
requires the subject to note the position of the arms of the examiner 


1 Published by the Bureau of Publications, Teachers College, Columbia 
University. 
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and then draw them in the correct position on the appropriate picture 
of a manikin in the test booklet. The test is difficult to give and 
requires much preliminary rehearsal on the part of the examiner, but 
when given properly it can be understood by children entering school 
for the first time either in the kindergarten or first grade. 

This test was given to the lower grades in two schools for the deaf
for the purpose of testing its suitability for the young deaf child.¹
Practically all of the children in the lowest grades in both schools were
tested, together with a sampling of older children in order to see how far up the
test would be discriminative. In all, two hundred ninety-three
children were examined, ranging in age from four to fifteen. Table
I shows the frequency distribution of the scores by age for all children
tested. Up to age eight inclusive the cases are fairly well scattered
over the total range of scores; but from age nine onwards the cases
begin to be bunched at the upper end. The median scores rise steadily
from age four to age nine, where they reach a level. It is obvious,
therefore, that the test is too easy for age nine and above. The lack 
of zero scores at any age would seem to indicate that the test is not 
too difficult for the ordinary deaf child during the first few years in 
school. It would seem, then, that we have here a test suitable for the
young deaf child between the ages of four and eight, a period of
development not covered at present by any other suitable non-language group
intelligence test. The content of the test proved of interest to the 
children. The teachers of the deaf children commented upon the 
suitability of the content and presentation of the test for deaf children. 
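
The levelling of the medians described above can be checked with a short sketch. It is illustrative only: the values are those of the "Median" row of Table I, the assignment of the twelve columns to ages four through fifteen is an assumption based on the stated age range, and the function names are hypothetical.

```python
# Median scores read from the "Median" row of Table I.
# Assumption: the twelve columns correspond to ages 4 through 15, in order.
MEDIANS = {4: 13, 5: 31, 6: 49, 7: 56, 8: 61, 9: 66,
           10: 65, 11: 68, 12: 68, 13: 66, 14: 71, 15: 68}


def yearly_gains(medians):
    """Gain in median score from each age to the next, keyed by the later age."""
    ages = sorted(medians)
    return {b: medians[b] - medians[a] for a, b in zip(ages, ages[1:])}


def levelling_age(medians):
    """First age whose median is not exceeded by the median of the next age."""
    ages = sorted(medians)
    for a, b in zip(ages, ages[1:]):
        if medians[b] <= medians[a]:
            return a
    return None


gains = yearly_gains(MEDIANS)
print(gains)
print(levelling_age(MEDIANS))
```

The rule used here (the first age not exceeded by the next) is only one simple way of locating the level; with so few cases at the upper ages, small later fluctuations should not be given weight.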

So far this test has not been adequately standardized on hearing 
children, but a comparison of the deaf children’s scores with the 
tentative norms for the hearing may be of interest. The results up 
to the present time are as follows: 








CA             | 5-6         | 6-6 | 7-6 | 8-6

Median Scores:
Hearing....... | [illegible] | 45  | 55  | 65
Deaf.......... | 31          | 49  | 56  | 61





1 The writer wishes to acknowledge the splendid cooperation of Dr. Harris
Taylor of the Institute for the Improved Instruction of Deaf Mutes, and of Miss
Carrie Kearns of P. S. 47, Manhattan. Two graduate students, Mr. Chester
Bennett and Miss Elsa Richards, had charge of the testing, and the writer is indebted
to them for their careful work.
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The high score for the five-year-old deaf is based upon only nine 
cases and these probably represent a very superior group. At the 
other three ages, where we have a better sampling of deaf children,
the scores are about the same as those for hearing children. This 
result is in marked contrast to most comparisons between the deaf 
and hearing on group intelligence tests, where the deaf are usually 
very far behind the hearing. It may be that our tentative norms 
for the hearing are too low, or that the deaf children tested are a 
much superior sampling to deaf children in general. Further work 
with both deaf and hearing children will clear up these points. 


TABLE I.—DISTRIBUTION OF SCORES BY AGE GROUPS
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30-4 1 81213 

25-9 1 1 
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5-9 te se: 
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, re 5 | 9| 28 | 34 | 49 | 49 | 49 | 31| 20) 8| 6| 5 | 293 
Median....... 13 | 31 | 49 | 56 | 61 | 66 | 65 | 68 | 68 | 66 | 71 | 68 | 



































There is another possible explanation for this equality or superior- 
ity of the deaf child on this test. One of the examiners, Mr. Bennett, 
believes that the deaf child has an advantage over the hearing child 
inasmuch as his whole training teaches him to pay more attention to 
gestures, facial expression and the like, and hence he is better able 
to interpret the pantomimic instructions of the examiner and profit 
more from them than the hearing child does. 
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SUMMARY 


A group intelligence test for young children has been constructed, 
which is presented by means of pantomime and examples on the black- 
board. No language is required to understand the directions or to 
respond to the test items. By means of this test it is possible to test 
young deaf children in schools for the deaf. It is the first group test 
suitable for such cases. The results so far obtained would seem to 
indicate that it is discriminative for ages four to eight inclusive. The
deaf children so far tested slightly exceed the tentative norms for the
hearing.





















WHAT IS MEANT BY A G FACTOR? 
TRUMAN L. KELLEY
Harvard University 


The purpose of this article is to discuss what constitutes the proof 
that a common factor is sufficient to account for the intercorrelations 
between a number of variables. The particular occasion for it is the 
article by Professor Holzinger¹ in which he uses certain data of mine
as follows: 


By way of illustrating these points, we may take an example from Professor 
Kelley’s book (p. 97ff.). The four tests used are: 


X₁ = Reading speed.

X₂ = Arithmetic power.

X₃ = Memory for words.

X₄ = Memory for meaningful symbols.


The intercorrelation and tetrad differences are as taken from a paper by the
writer.²





       |  X₁   |  X₂   |  X₃   |  X₄
  X₁   |  ...  | .0586 | .1950 | .2969
  X₂   |       |  ...  | .1487 | .2489
  X₃   |       |       |  ...  | .6693

t₁₂₃₄ = −.010 ± .037.

t₁₂₄₃ = −.005 ± .037.

t₁₃₄₂ = .005 ± .016.


From the insignificance of these tetrads we may conclude that factor pattern 
(1) with only g common is sufficiently complex. If group factors are present in 
these four tests their effect is insignificant, yet Professor Kelley employs the pattern 
given by the following portion of his Table XII (op. cit.). Numbers in the table 
are standard deviations.





¹ Holzinger, K. J.: Thorndike’s CAVD Is Full of G. The Journal of Educational
Psychology, March, 1931.
² Holzinger, K. J.: On Tetrad Differences of Overlapping Variables. Journal
of Educational Psychology, Feb., 1929.
                                    | Heterogeneity: |        |        |        |         |        |        |
                                    | maturity,      | Verbal | Number | Memory | Spacial | Speed  | Not    |
Tests                               | sex, race      | factor | factor | factor | factor  | factor | chance | Chance
1. Reading speed................... |      .40       |  .69   |  ...   |  ...   |   .09   |  .38   |  .36   |  .28
2. Arithmetic power................ |      .21       |  ...   |  .63   |  ...   |   ...   |  ...   |  .16   |  .66
3. Memory for words................ |      .66       |  .09   |  ...   |  .56   |   ...   |  ...   |  .36   |  .33
4. Memory for meaningful symbols... |      .59       |  ...   |  ...   |  .52   |   .36   |  ...   |  .82   |  .89














This happens to be a very false presentation of my argument and 
findings because the other elements in the problem have not been 
taken into consideration. It is totally unsound to investigate four 
variables, find that a single underlying factor is sufficient to account 
for the intercorrelations, and conclude that the same underlying factor 
(or any other, for that matter) would account for the intercorrelations 
of these variables when employed in connection with variables other 
than the four. Awareness of this is fundamental to a study of mental 
analysis, and that Holzinger has overlooked it both in connection with 
Thorndike’s interpretation of CAVD and my interpretation of my 
tests is unfortunate because it confuses the issues involved. 

A brief presentation of hypothetical data will, I hope, make the
issue clear. Let us be given the following eight tests and let them
be totally accounted for by five underlying independent, i.e., uncorrelated
factors labelled a, b, c, d, e, as shown in Table I. The constants
are designated by literal symbols, other than x, with subscripts. The
chance factors that ordinarily enter into measures have been omitted
merely for the sake of simplicity of presentation.


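TABLE I
$$
\begin{aligned}
x_1 &= m_1 a + n_1 b \\
x_2 &= m_2 a + o_2 c \\
x_3 &= m_3 a + p_3 d \\
x_4 &= m_4 a + q_4 e \\
x_5 &= m_5 a + n_5 b \\
x_6 &= n_6 b + o_6 c \\
x_7 &= n_7 b + p_7 d \\
x_8 &= n_8 b + q_8 e
\end{aligned}
$$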


With variables 1 to 8, due to underlying factors as given, it can 
readily be shown that the tetrads involving variables 1, 2, 3, 4, will 
equal zero, implying that one underlying factor is sufficient to account 
for the intercorrelations. This is true but the intercorrelations
accounted for are r₁₂, r₁₃, r₁₄, r₂₃, r₂₄, r₃₄, and none other. The correlation
r₁₅ is not accounted for. Variable 1 has just as much “right”
to be considered in connection with variable 5 as in connection with
variables 2, 3 and 4. Dr. Holzinger’s use of my data is equivalent
to abstracting from a table, such as Table I, the first four rows only, 
and then pointing to the lack of evidence of factors b, c, d, and e. 
Certainly there is lack of evidence in the first four rows, but there is 
no lack of evidence in the table entire. 

Consider the data of Table I: All of the tetrads involving x₁,
x₂, x₃, x₄ equal zero, and secondly all the tetrads involving x₅, x₆, x₇,
x₈ also equal zero. Clearly one underlying factor is sufficient to
account for the correlations between the first four variables. Let
us call this factor g. Clearly one underlying factor is sufficient to
account for the correlations between the last four variables. Let us
call this factor g. The next thing is to say that g = g. As a bit of
verbalism it sounds reasonable but is, in truth, absurd. The fact 
that one investigator finds a single underlying factor sufficient in a 
given set of tests, and a second finds a single underlying factor sufficient 
in a different set, is not evidence at all that the two factors are the same. 
In the Table I illustration they are entirely different, being an a factor 
in the first four tests, and a b factor in the last four. To generalize 
from the relationships found in four tests to relationships supposed to 
lie in the mind of man begs the question, for doing so involves the 
assumption that the four tests sample the entire mental life. There 
is no certain technique which reveals the latter, but the more extensive 
in number and varied in nature the tests employed, the more likelihood
that the relationships found to underlie test scores will reflect
relationships in mental life. 
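
The Table I argument can be checked numerically. The Python sketch below is an illustration only: the loadings are arbitrary values assumed for the purpose, not data from any study, and the names `LOADINGS`, `corr`, `corr_from_factor`, `tetrads` and `all_vanish` are hypothetical. It assumes unit-variance, uncorrelated factors, as in Table I, and works from the correlations those loadings imply.

```python
from math import isclose, sqrt

# Loadings of each test on the five uncorrelated, unit-variance factors
# (a, b, c, d, e), arranged as in Table I. The numeric values are arbitrary
# assumptions chosen for illustration; they are not data from any study.
LOADINGS = {
    1: (0.6, 0.5, 0.0, 0.0, 0.0),  # x1 = m1*a + n1*b
    2: (0.7, 0.0, 0.4, 0.0, 0.0),  # x2 = m2*a + o2*c
    3: (0.5, 0.0, 0.0, 0.6, 0.0),  # x3 = m3*a + p3*d
    4: (0.8, 0.0, 0.0, 0.0, 0.3),  # x4 = m4*a + q4*e
    5: (0.4, 0.7, 0.0, 0.0, 0.0),  # x5 = m5*a + n5*b
    6: (0.0, 0.6, 0.5, 0.0, 0.0),  # x6 = n6*b + o6*c
    7: (0.0, 0.5, 0.0, 0.7, 0.0),  # x7 = n7*b + p7*d
    8: (0.0, 0.8, 0.0, 0.0, 0.4),  # x8 = n8*b + q8*e
}


def _sd_product(i, j):
    """Product of the standard deviations of tests i and j."""
    return sqrt(sum(p * p for p in LOADINGS[i]) * sum(q * q for q in LOADINGS[j]))


def corr(i, j):
    """Correlation between tests i and j implied by all five factors."""
    cov = sum(p * q for p, q in zip(LOADINGS[i], LOADINGS[j]))
    return cov / _sd_product(i, j)


def corr_from_factor(i, j, factor_index):
    """Correlation that one factor alone would produce between tests i and j."""
    return LOADINGS[i][factor_index] * LOADINGS[j][factor_index] / _sd_product(i, j)


def tetrads(i, j, k, l):
    """The three tetrad differences among four tests."""
    return (
        corr(i, j) * corr(k, l) - corr(i, k) * corr(j, l),
        corr(i, j) * corr(k, l) - corr(i, l) * corr(j, k),
        corr(i, k) * corr(j, l) - corr(i, l) * corr(j, k),
    )


def all_vanish(tests):
    """True when every tetrad difference among the four tests is zero."""
    return all(isclose(t, 0.0, abs_tol=1e-12) for t in tetrads(*tests))


print(all_vanish((1, 2, 3, 4)))  # one factor (a) suffices here
print(all_vanish((5, 6, 7, 8)))  # one factor (b) suffices here
print(all_vanish((1, 2, 5, 6)))  # a mixed set of tests
print(corr(1, 5), corr_from_factor(1, 5, 0))  # r15 versus what factor a alone gives
```

With these assumed loadings the tetrads vanish within tests 1 to 4 and within tests 5 to 8, yet not for the mixed set 1, 2, 5, 6, and r₁₅ is larger than factor a alone would produce. The single factor found in each set of four is thus a different factor, which is the point of the illustration.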
THE GROWTH OF SOCIAL PERCEPTION IN
DIFFERENT RACIAL GROUPS¹


W. N. KELLOGG AND B. M. EAGLESON 


Indiana University 


I. OUTLINE OF THE PROBLEM 


Of the numerous attempts to uncover racial differences between 
negroes and whites, the possibility of a distinction in the ability to 
interpret facial expression seems particularly fruitful. It may be 
argued for example that negroes should be less able to understand the 
emotional expression of white individuals because they have fewer 
opportunities as a class to observe whites in emotional situations. 
They would correspondingly be supposed to be superior, however, 
in comprehending the expressions of the negro countenance. On the 
other hand, an ability surpassing that of whites in the interpretation 
of emotional expression of all sorts might theoretically be ascribed to 
the negro as an outgrowth of the greater propensities for emotional 
behavior which he is generally supposed to possess. 

One of the most direct methods of approaching such a problem 
is to determine at what stage of development differences, if any, 
begin to appear. This can be done by examining groups of children
of each race, at varying age levels, equated as nearly as possible
in respect to intelligence and social status. G. S. Gates² has already
reported an investigation of the growth of social perception in four 
hundred fifty-eight white children from three to fourteen years old. 
Since her method is simple and the groups are carefully described we 
have taken her results as representative for white children, and have 
duplicated the experiment as nearly as possible upon similar groups of 





¹ The writers wish to acknowledge the generous cooperation of the following
persons, whose assistance in granting permission to enter the various schools and 
whose suggestions in the selection of different social groups are sincerely appreci- 
ated: Mr. Daniel T. Weir, Acting Superintendent of Indianapolis Public Schools; 
Mrs. Grace L. Brown, Superintendent of Indianapolis Kindergartens, Messrs. W. 
N. Grubbs and E. W. Diggs, principals respectively of Public Schools 24 and 42 of 
Indianapolis, and Mr. Anthony Courtney, Principal of the Banneker (negro) 
Public School in Bloomington, Indiana. 

² Gates, G. S.: An Experimental Study of the Growth of Social Perception.
Journal of Educational Psychology, Vol. XIV, 1923, pp. 449-461. 



negroes. The technique adopted by Gates has been followed meticu- 
lously in order to make the findings validly comparable to those she 
reports. The only factor which remains uncontrolled in our procedure 
is the geographical difference in the locations of the negroes and the
whites. That this has introduced no spurious influences will appear,
we think, from the results.


II. EXPERIMENTAL CONDITIONS 


Materials.—Six pictures of the face, head and shoulders of a 
woman in various emotional poses were selected from the series 
published by Ruckmick.¹ Typical interpretations given by adults
to each of these pictures are as follows:²


Picture A.—Laughter, joy or amusement. 
Picture B.—Pain. 

Picture C.—Anger or defiance. 

Picture D.—Fear or horror. 

Picture E.—Scorn, contempt or disdain. 
Picture F.—Surprise, wonder or amazement. 


The prints were presented “singly and individually” by a negress
experimenter. Race and sex relationships between experimenter and 
subjects were thus equivalent to those between Gates’ subjects and 
experimenters. It should be noted, however, that the emotional 
pictures were identical in both studies and that they were poses of 
a white woman. 

Method.—After preliminary efforts to put the subject at ease, 
his name and age were asked. School grade and sex were recorded 
at the same time. The experimenter then said (1): “I am going to 
show you some pictures of a lady, and I want you to tell me what she is 
doing.” If no response, or if an incorrect or doubtful response was 
made the subject was asked (2): “What is she thinking about?” and,
after a pause (3): “How does she feel?” Replies to these questions
were recorded verbatim. 

The problem of grading and classifying the answers accurately, 
particularly those of the younger children is, as Gates has pointed out, 





¹ Ruckmick, C. A.: A Preliminary Study of the Emotions. Psychological
Monographs, 1921, Vol. XXX, No. 136 (Critical and Experimental Studies
in Psychology from the University of Illinois), pp. 30-35.

² Cf. either Ruckmick or Gates: Op. cit.







a difficult one. We endeavored, however, to adhere strictly to the 
criteria established by her. Those responses listed as correct or 
incorrect for white children were similarly classified throughout this 
experiment. A few answers were received, however, which Gates does 
not report; in these cases “any interpretation which expressed the
general trend of feeling in the picture was counted correct.”

In scoring, the principle of giving the subject the benefit of the doubt 
was adopted. Thus when it had been necessary to ask two or three 
of the experimental questions the answer to one of which proved accept- 
able, the subject was allowed credit for his best answer unless replies 
to the remaining questions clearly indicated that he had no compre- 
hension whatsoever of the expression pictured. 

Subjects.—Three hundred thirty-two negro school children whose
ages ranged from three to fourteen years were selected for this study 
from as widely varying social levels as were available in the vicinity 
of Indiana University. The total number employed was divided about 
equally between boys and girls. Ninety-three of the group, represent- 
ing all social strata, came from the only negro public school in the city 
of Bloomington, Indiana, a community of approximately 18,000 popula- 
tion. Of the remaining subjects, ninety-eight were taken from Public 
School No. 42 in Indianapolis,¹ which is located in a middle and upper
class negro district. A third group, numbering one hundred eight, 
came from Public School No. 24 of Indianapolis, which was chosen 
as representative of the middle and lower strata of colored families. 
Children of ages three, four and five were tested in the Flanner House 
Kindergarten and Day Nursery and in Kindergarten No. 4, both of 
Indianapolis. A rough classification according to intellectual ability, 
as estimated by teachers, was obtained for all subjects six years and 
over in age. 

By these selections the authors feel that a fair sampling of various
social and environmental levels of the typical urban colored population
has been obtained.


III. RESULTS


Comparisons between Racial Groups.—The summarized findings
for all subjects, regardless of social or intellectual status, are given in





¹ All the schools visited were exclusively negro institutions, including teachers
and principals. 
Table I. Figures of this table are the per cents of correct interpreta-
tions, picture by picture, arranged according to the ages of the respec- 
tive subjects. 

Aside from minor variations the data indicate substantially 
the same facts brought out in Gates’ study of white children, namely, 


TABLE I.—PER CENTS OF CORRECT RESPONSES PER PICTURE FOR THREE HUNDRED
THIRTY-TWO NEGRO CHILDREN

(All Groups Combined)

Age                   |  3  |  4  |  5  |  6  |  7  |  8  |  9  | 10   | 11  | 12  | 13  | 14
Picture A (laughter). | ... | ... | .80 | .87 | .73 | .83 | .97 | 1.00 | .88 | .94 | .85 | .91
Picture B (pain)..... | ... | ... | .45 | .50 | .68 | .70 | .86 | .91  | .72 | .61 | .74 | .33
Picture C (anger).... | ... | ... | .20 | .47 | .51 | .60 | .62 | .70  | .78 | .87 | .88 | .66
Picture D (fear)..... | ... | ... | .15 | .03 | .35 | .37 | .51 | .67  | .72 | .74 | .85 | .74
Picture E (scorn).... | ... | ... | .00 | .00 | .03 | .03 | .00 | .00  | .13 | .16 | .18 | .17
Picture F (surprise). | ... | ... | .00 | .00 | .00 | .03 | .11 | .12  | .25 | .35 | .29 | .34
Number of cases...... | ... | ... | 20  | 30  | 37  | 30  | 37  | 33   | 32  | 31  | 34  | 35
(1) that the percentage of successful responses tends to increase with
age regardless of the emotional expression judged, and (2) that the 
general order of perceptibility is, with occasional exceptions, as 
follows: Laughter, pain, anger, fear; surprise and scorn. The per- 
centages of successful responses of the negro children exceed those 
of the white children (as shown in Gates’ Table V, p. 460) more times 
in the interpretation of the Picture D (fear) than with any of the 
other expressions. It is doubtful, however, if any particular signifi- 
cance should be ascribed to this result. The findings on the whole 
are strikingly similar in the two experiments. 

One point of some significance which appears in the figures of 
Table I is the tendency of the percentages to drop for many of the 
pictures at ages thirteen and fourteen. An examination of the 
analogous data for white children brings out the same sort of decline at 
the upper ages. The drop is probably to be accounted for, we think, 
by the fact that the older children who remain in school (whether white 
or colored) represent the lower intelligence levels for their ages, and 
would consequently be expected to be inferior in social perception 
when compared with subjects slightly younger than themselves. 

In Table II we have recomputed Gates’ percentages for white 
children, combining the results for all six prints, i.e., without consideration
for the emotion depicted. These data are here paired with the
similar figures of the negro children.¹


TABLE II.—PER CENTS OF SUCCESSFUL RESPONSES OF WHITES AND NEGROES
(All Pictures Combined)

Age                  |  3  |  4  |  5  |  6  |  7  |  8  |  9  | 10  | 11  | 12  | 13  | 14
Whites (from Gates). |
  Per cent.......... | .25 | .26 | .28 | .35 | .39 | .39 | .49 | .58 | .73 | .75 | .60 | .77
  Number of cases... | 10  | 40  | 85  | 59  | 55  | 58  | 39  | 28  | 44  | 27  | 17  |  8
Negroes.             |
  Per cent.......... | .00 | .15 | .27 | .31 | .38 | .43 | .51 | .57 | .58 | .61 | .63 | .61
  Number of cases... |  5  |  8  | 20  | 30  | 37  | 30  | 37  | 33  | 32  | 31  | 34  | 35
Fig. 1.—Growth curves showing scores of negroes and whites as given in Table II.
The scores of the whites are not consistently superior to those of the negroes except 
at the younger ages where the white subjects were highly selected (see text). 


Disregarding for the moment the percentages below year five, we
see that in six of the remaining ten age-groups, the values obtained
by the colored children either excel those of the whites or approach
to within two points of the results of the whites. There appears
therefore to be no obvious tendency for the samples of either race to
be superior to those of the other in this ability.² The percentages
for years three and four are not validly comparable since our figures 





¹ The percentages were computed by dividing the actual successes, regardless of
the type of expression judged, by the total possible successes (= the number of
cases × 6).

² The reliabilities of these differences were not obtained.
were obtained from average colored children while the majority of
Gates’ subjects of these ages came from “the Kindergarten of a
select private school.”
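
The pooled per cents of Table II and the count of six age-groups out of ten can be reproduced from the tabled figures. The Python sketch below is an illustration only: the names `GATES_SAMPLE`, `PRESENT_SAMPLE`, `pooled_per_cent` and `ages_equal_or_within` are hypothetical, the per cents are entered in hundredths to keep the comparison exact, and the figure of 32 successes for the twenty five-year-olds is an assumption inferred from the age-five per cents of Table I rather than a count reported in the text.

```python
AGES = list(range(5, 15))

# Pooled per cents from Table II, in hundredths, for ages five to fourteen.
GATES_SAMPLE = [28, 35, 39, 39, 49, 58, 73, 75, 60, 77]    # whites (from Gates)
PRESENT_SAMPLE = [27, 31, 38, 43, 51, 57, 58, 61, 63, 61]  # present study


def pooled_per_cent(successes, cases, pictures=6):
    """Actual successes divided by total possible successes (cases x pictures)."""
    return successes / (cases * pictures)


def ages_equal_or_within(ours, theirs, margin=2):
    """Ages at which `ours` exceeds `theirs` or falls short by no more than `margin` points."""
    return [age for age, o, t in zip(AGES, ours, theirs) if o >= t - margin]


close_ages = ages_equal_or_within(PRESENT_SAMPLE, GATES_SAMPLE)
print(close_ages)       # [5, 7, 8, 9, 10, 13]
print(len(close_ages))  # 6

# Assumed count: 32 successes among 20 children gives the tabled .27.
print(round(pooled_per_cent(32, 20), 2))
```

The six ages returned are five, seven, eight, nine, ten and thirteen, which agrees with the count given in the text.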

Intellectual, Sex, and Social Classifications.—Percentages in Table 
III (vide infra) were obtained in the same manner as those of Table 
II, except that the data were subdivided into three classes according to 
teachers’ estimates of the intelligence of the children. 


TABLE III.—PER CENTS OF SUCCESSES OF GROUPS CLASSIFIED ACCORDING TO
INTELLIGENCE


(All Schools Combined)
Number of cases.... 12 | 11 | 10 | 15 | 9 | 11 | 12 | 12 | 10
Group Z (inferior).
Per cent........... .30 | .29 | .33 | .45 | .58 | .46 | .61 | .57 | .54
Group Z, with few exceptions, is lower in social perception at
all ages than either Groups X or Y. Group X shows little superiority 
over Group Y except in the later ages, namely, eleven, twelve, thirteen 
and fourteen. There thus appears to be a positive relationship of a 
crude sort indicated in our figures between social perception and 
intelligence. The failure of the X group to demonstrate its superiority 
over the Y group at the younger ages is probably to be explained as 
a function of the inaccuracy of our method of measuring the 
“intelligence” of these subjects. 

According to Gates: “Sex differences are not manifest and differences
caused by social status, if they occur, are slight.” This applies,
of course, to her experiment with white children. No figures are 
given to substantiate these statements. Our own results, however, 
indicate a rather pronounced sex difference, as shown in Table IV. 
The boys are superior to the girls only in the younger ages (where in 
most instances the number of cases is small). From seven to fourteen 
years, on the other hand, girls exceed the boys. This superiority is 
to be explained no doubt as a result of the more rapid maturing of the
girls. The female subjects, actually in a more advanced stage of 
development than males of the same age, would hence be capable 
of better performance in mental tasks. 


TABLE IV.—PER CENTS OF SUCCESSES OF NEGRO BOYS VS. NEGRO GIRLS
(All Schools Combined) 
Fig. 2.—Growth curves showing relative scores of negro girls and negro boys as given
in Table IV. Above age 6 the scores of the girls demonstrate a consistent tendency 
to be slightly higher than those of boys of the same ages. 


Differences between the major groups of the three public schools 
are indicated in Table V. Two rather striking facts appear from the 
data in this table: (1) The subjects from the Bloomington school are 
superior to those of either of the other schools in five of the possible 
nine age-groups; (2) the subjects from Public School No. 42 (presuma- 
bly from average to superior social levels) exceed the other two groups 
at only one age. 

A comparison of the two Indianapolis schools, year by year, dis- 
closes that neither is consistently superior to the other. The subjects 
from these two schools were specifically selected, however, because
of differences in environmental levels. It is indicated, therefore, from 
these figures that distinctions in social strata are of minor significance 
in the development of social perception. The general superiority 
of the Bloomington children over the other groups can be accounted 


TABLE V.—PER CENTS OF CORRECT RESPONSES FOR DIFFERENT SOCIAL GROUPS
for by the fact that in this small school two or three grades are kept
together in a single classroom. The continual contact with those
older than themselves affords the children greater opportunity to
become adept in social perception than would be possible if they were 
segregated in separate classes. 


IV. SUMMARY AND CONCLUSIONS 


Following the procedure adopted by G. S. Gates for measuring the
growth of social perception in white children, three hundred thirty-two
negroes varying in age from three to fourteen years were similarly
tested. Six pictures of the Ruckmick series were presented one at a
time to each subject who then attempted to indicate the emotion
represented.

1. Regardless of geographical differences in the location of the 
whites and negroes under examination and the fact that children 
of both stocks judged facial expressions posed by a white woman, a 
striking similarity was evidenced year by year in the data for the two 
racial groups. 
2. The growth of social perception in negro children was hence
demonstrated to be the same in all major respects as that which Gates
found for white subjects. 

3. The negro girls were consistently superior to the negro boys 
except at the very young ages. 

4. A rough positive relationship appears between social perception 
as measured in this experiment and teachers’ ratings of intelligence. 

5. Groups of different social status manifest no superiority in the 
interpretation of facial expressions which can not be explained by 
factors other than their respective social levels. 
ON KIN RESEMBLANCES IN PHYSIQUE VS.
INTELLIGENCE 


HERBERT S. CONRAD*


Institute of Child Welfare, University of California 


A number of studies have pointed out the similarity of familial 
coefficients of resemblance in physical and in mental traits. The 
inference, by analogy, is usually that the mental traits are inherited in 
the same way and to the same extent as the physical.‘ As a possible 
objection to this, Spearman long ago pointed out that the physical 
coefficients of resemblance are relatively unattenuated by the unrelia- 
bility of measurement; whereas the mental coefficients are probably 
considerably attenuated.'* One of our leading investigators in this 
field has recently found the resemblance of siblings in intelligence to 
equal a correlation of .60 (corrected for attenuation). He concludes:"” 


If we may accept Pearson’s results for the resemblance of siblings in eye color, 
hair color, and cephalic index (.52, .55, and .49), and regard .52 + .016 as the 
resemblance in traits entirely free from environmental influence, we may infer that 
the influence upon intelligence of such similarity in environment as is caused by being 
siblings two to four years apart in age in an American family to-day is to raise the 
correlation from .52 to .60.†


This conclusion, while possibly correct, can hardly be accepted 
solely on the evidence presented by its author. The measure of 
intelligence employed, consisted of ‘‘a selection of tests from standard 
instruments—the Institute of Educational Research Tests of Selective 
and Relational Thinking, Generalization, and Organization.’’!% Now, 
if the author had taken a single test from his total intelligence battery, 
and compared the correlation of siblings in this test with the correlation 
of siblings in (say) eye color, the comparison might have been 
acceptable. Eye color is, apparently, a relatively simple trait, deter- 
mined by a small number of genes; and perhaps the score in a single 
test approaches genetic‡ comparability; but the total score in an entire





* The manuscript has benefited from the criticism of Dr. Robert C. Tryon, 
National Research Fellow at the University of California. 

† Italics as in the original.

‡ Throughout this paper, the word “genetic” is used in the biological sense of
“hereditary,” or “pertaining to genes,” never in the sense of “pertaining to
development,” as in the term “genetic psychology.”
intelligence test is almost certainly too complex for a comparison with
eye color to be genetically significant. Most students of the constitu- 
tion of mental traits, from Spearman to Thomson to Kelley, appear 
agreed on at least one point: that the total score on an intelligence test 
represents a composite (with unknown weights) of several more-or-less 
intellectual abilities, or traits. The statistical effect of merging several 
component traits into a general, composite trait like “total intelligence- 
test score’’ is, in general, to increase—and, so far as the comparison 
with simple physical traits is concerned, to spuriously increase—the 
correlation between siblings in the general trait. A precise evaluation 
of the spurious rise in correlation may be secured by applying Spear- 
man’s formula for the correlation of sums and averages.® 

Let us, for example, suppose that the average correlation between 
siblings in each of eleven separate tests is .52;* that is to say, 


$$r_{1\mathrm{I}} = .52,\quad r_{2\mathrm{II}} = .52,\quad r_{3\mathrm{III}} = .52,\quad \ldots,\quad r_{11\,\mathrm{XI}} = .52.$$


This coefficient coincides with the average found by Pearson for 
certain physical traits (eye color, hair color, and cephalic index). 
Now let us add the eleven tests to form a single battery or “intelligence
test.” What is the correlation between the siblings in this total 
battery? If the average intercorrelation between the separate 
tests (for the identical individuals tested on the several tests) be .60, 
and if the average of the “cross-correlations” be .37, then, applying
Spearman’s formula, we find that the correlation between the siblings 
in the total test is .60 (see Table I). Would this increase from .52 to 
.60 represent, as appears to have been argued, the effect of environ- 
ment; or would it rather represent a necessary statistical outcome of 
the combination of individual tests into a composite total score? 
Obviously, the coefficient of kin resemblance which one obtains for 
“intelligence” depends to a marked extent on the number of tests which





* Throughout this illustration, all r’s as given are assumed to be freed from the 
attenuation due to unreliability of measurement. }* 

The notation which follows is somewhat similar to that of Kelley.’ 1 refers 
to Test 1 administered to the persons in the X-distribution; I refers to the same 
test administered to the siblings of the persons in the X-distribution; $r_{1\mathrm{I}}$ is
therefore the correlation between siblings in Test 1. Similarly, $r_{2\mathrm{II}}$ is the correlation
between siblings in Test 2; etc. In like fashion, $r_{1\mathrm{II}}$ would mean the “cross-correlation”
between the scores of persons in Test 1, and the scores of their siblings in
Test 2; $r_{2\mathrm{III}}$ would have a similar meaning for the sibling cross-correlation in
Tests 2 and 3; etc. See Table I.
are merged into the total “intelligence” score, and the intercorrelation of
these tests. With a suitable battery of tests called “intelligence,”
it seems that almost any degree of sibling resemblance might be
obtained.*


TABLE I.—THE CORRELATION BETWEEN SIBLINGS IN A TOTAL INTELLIGENCE
TEST, DETERMINED FROM THEIR CORRELATION IN THE SEPARATE SUBTESTS


$$
r_{(1+2+\cdots+11)(\mathrm{I}+\mathrm{II}+\cdots+\mathrm{XI})}
  = \frac{ab\,\bar{r}_{pQ}}
         {\sqrt{a + (a^{2} - a)\,\bar{r}_{pq}}\;\sqrt{b + (b^{2} - b)\,\bar{r}_{PQ}}}
$$


(1 + 2 + · · · + 11) means the sum of the scores in the eleven tests, made by the
persons in the X-distribution.
(I + II + · · · + XI) means the sum of the scores in the same eleven tests, made by
the sibs of the persons in the X-distribution.
a = b = number of tests = 11.
$\bar{r}_{pq}$ = the average intercorrelation between the tests, for the persons in the X-distribution
= .60.
$\bar{r}_{PQ}$ = the average intercorrelation between the tests, for the persons in the Y-distribution
= .60.
$\bar{r}_{PQ}$ is assumed to equal $\bar{r}_{pq}$, in this case, because the same tests are involved, only
the sample receiving the tests differs nominally. (The sample from which
$\bar{r}_{PQ}$ is obtained consists of the siblings of the sample from which $\bar{r}_{pq}$ is obtained.)

$\bar{r}_{pQ}$ = the average of:
$$
\begin{aligned}
& r_{1\mathrm{I}} + r_{1\mathrm{II}} + r_{1\mathrm{III}} + \cdots + r_{1\,\mathrm{XI}} \\
{}+{} & r_{2\mathrm{I}} + r_{2\mathrm{II}} + r_{2\mathrm{III}} + \cdots + r_{2\,\mathrm{XI}} \\
{}+{} & r_{3\mathrm{I}} + \cdots \\
{}+{} & r_{11\,\mathrm{I}} + r_{11\,\mathrm{II}} + \cdots + r_{11\,\mathrm{XI}}
\end{aligned}
$$

(There are, in all, 11 coefficients of correlation with subscripts like 1I, 2II, 3III,
4IV, etc.; each of these correlations equals .52. There are 110 coefficients of
correlation with subscripts like 1II, 1III, . . . 1XI, 2I, 2III, . . . 2XI, 3I, 3II,
3IV, . . . 3XI, etc.; the average of these 110 “cross-correlations,” by the assumptions
of this illustration, equals .37.)

$$
r_{(1+2+\cdots+11)(\mathrm{I}+\mathrm{II}+\cdots+\mathrm{XI})}
  = \frac{11(.52) + 110(.37)}
         {\sqrt{11 + (121 - 11)(.60)}\;\sqrt{11 + (121 - 11)(.60)}}
  = .60
$$
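
The arithmetic of Table I can be checked with a short Python sketch. It is an illustration only, under the assumptions of the table: unweighted sums of equally variable tests, the same average intercorrelation (.60) in both samples, a sibling correlation of .52 in each single test, and an average cross-correlation of .37. The function names `mean_cross` and `correlation_of_sums` are hypothetical.

```python
from math import sqrt


def mean_cross(n_tests, same_test_r, cross_test_r):
    """Average of all n*n sibling correlations: n same-test values and n*(n-1) cross-correlations."""
    total = n_tests * same_test_r + n_tests * (n_tests - 1) * cross_test_r
    return total / (n_tests * n_tests)


def correlation_of_sums(a, b, mean_within_x, mean_within_y, mean_between):
    """Correlation between two unweighted sums of a and b equally variable tests."""
    numerator = a * b * mean_between
    denominator = sqrt(a + (a * a - a) * mean_within_x) * sqrt(b + (b * b - b) * mean_within_y)
    return numerator / denominator


# Eleven tests, as in Table I.
r_total = correlation_of_sums(11, 11, 0.60, 0.60, mean_cross(11, 0.52, 0.37))
print(round(r_total, 2))  # about .60

# The same assumptions applied to batteries of other sizes.
for n in (1, 2, 5, 11, 25):
    r_n = correlation_of_sums(n, n, 0.60, 0.60, mean_cross(n, 0.52, 0.37))
    print(n, round(r_n, 3))
```

With a single test the function returns .52; the coefficient rises as tests are added, although no assumption about environment has been changed.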





* Some illustrations of this from experimental studies are available. Starch," 
for example, finds the average correlation between eighteen pairs of adult siblings 
in ten separate achievement tests to be .42; in five separate mental tests, the corre- 
lation is .38; but in all fifteen tests combined, the correlation is .73! Somewhat 
similarly, Jones® finds the correlation between single parent and single child to 
equal .548; but the correlation of mid-parent with mid-child (average score of all 
the children) is .651. Willoughby,²⁰ making use of scores in single tests only, and
correcting for attenuation, reports an average sibling coefficient of .42; this, as
Terman remarks, is “out of line with (lower than) the results of other investigators.”¹⁶
These “other investigators” have, in general, used total intelligence test
scores. 

† Cf. Reference 6.
It is evident, then, that to measure a functional complex, like
“intelligence,” and to assume this complex as a genetic entity,* 
may involve statistical difficulties. Of course it is impossible, in the 
realm of mental characters, to measure the strictly anatomical struc- 
tures to which the genetic entities most probably correspond. The 
point here is simply that the (different) ability complexes which are 
measured by (different) intelligence tests are not, in any likelihood, 
the functional counterparts of genetic entities. ‘‘Intelligence,” as 
measured, demands analysis. From this point of view, the mathe- 
matical studies of Rosenow,'! Spearman,'* Hull,’ and Kelley* would 
appear of a fundamental nature, and not as objets d’art.'* It is, 
however, unfortunate that these useful and stimulating mathematical 
contributions have not more specifically considered genetic problems 
and mechanisms. 

Another statistical misunderstanding deserves mention. In re- 
viewing the literature one occasionally finds the statement, or somehow 
secures the impression, that because the coefficient of correlation 
between kin is .20 (or .30 or .40, etc.), the evidence for kin-resemblance 
is strong. In truth, however, a correlation of .20 or .30 or .40 is indicative
not of strong similarity, but of fairly strong dissimilarity. A
more legitimate assertion from these correlations, perhaps, is that the 
evidence for heredity is significant. As a matter of fact, however, 
at least in the realm of mental traits, nobody knows (except by doubt- 
ful analogy with physical traits) what correlation should be expected 
between kin, if heredity alone were the cause of resemblance. Nobody 
knows, because nobody yet knows (except very hypothetically) 
the mechanism of mental inheritance.† Serious argument from





* By “genetic entity’ the writer means a single elementary trait which is caused, 
in part or in whole, by a specific genetic mechanism—whether unit-factor or 
multiple-factor. Thus, white forelock is a genetic entity (probably unit-factor); 
but handsomeness is not a genetic entity, because (like intelligence-test score) it is 
a complex trait composed of several individual (and arbitrarily weighted) entities. 

† R. A. Fisher! and Sewall Wright?! have postulated certain coefficients of
phenotypic resemblance between kin, on the basis of certain assumptions concern- 
ing the mechanism of heredity. Some of Fisher’s assumptions (particularly as to 
the number of genetic factors involved) may be doubtful in themselves; and Fisher 
himself admits that his whole set of assumptions may be over-simple (such possible 
phenomena as lethal factors, somatic mutation, environmental effect upon domi- 
nance, differential fertility of germs, sexual selection, and the correlation of heredi- 
tary and environmental factors are ignored). Wright gives the most complete set 
of predictions known to the writer; but these predictions are based upon definite 
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empirical correlation coefficients alone, therefore, seems as yet unjusti- 
fied. To give extreme examples for the emphasis of this point: 
The correlation between parent and offspring for trait X, in a pure, 
inbred, homozygous line, would (if the parental environmental factors 
are uncorrelated with the offspring’s) be zero.²² And yet this absence
of correlation is consistent with a quite profound influence of heredity; 
which, indeed, could never be deduced from the zero correlation 
coefficient alone, even if statistical allowance were made for the restric- 
tion of range in an inbred population.”** Conversely, the correlation 
between tuberculous infection in parents and tuberculous infection 
in offspring may be quite high; but this high correlation is consistent 
with a negligible or even zero influence of heredity (in the strict sense 
of the word). Perhaps the most embarrassing situation arises when the 
correlation which might be expected through the exclusive influence 
of heredity, coincides with that which might be expected through the 
exclusive influence of environment. 
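The first extreme case above can be illustrated with a small simulation. This sketch is not part of the original paper. It assumes a hypothetical model in which every member of a pure line carries the same genotypic value and each phenotype adds independent environmental noise; the function names and all numerical values are illustrative assumptions.

```python
import random


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_line(genotypic_value, n_pairs, rng):
    """Parent and offspring phenotypes within one homozygous line.

    Assumed model: phenotype = shared genotypic value + independent noise,
    so the environments of parent and offspring are uncorrelated.
    """
    parents = [genotypic_value + rng.gauss(0, 1) for _ in range(n_pairs)]
    offspring = [genotypic_value + rng.gauss(0, 1) for _ in range(n_pairs)]
    return parents, offspring


rng = random.Random(0)
parents_a, offspring_a = simulate_line(100.0, 20000, rng)  # hypothetical line A
parents_b, offspring_b = simulate_line(110.0, 20000, rng)  # hypothetical line B

# Within a line there is no genetic variation, so the correlation is near zero.
r_within_a = pearson_r(parents_a, offspring_a)

# Yet heredity fixes the level of the trait: offspring means follow the line.
mean_gap = sum(offspring_b) / len(offspring_b) - sum(offspring_a) / len(offspring_a)

print(f"parent-offspring r within line A: {r_within_a:.3f}")
print(f"difference between offspring means of the lines: {mean_gap:.2f}")
```

Under these assumptions the within-line coefficient hovers around zero although the trait level is wholly set by the line, which is the sense in which a zero correlation does not rule out a profound influence of heredity.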

A great deal of intrinsic interest attaches to the relative magnitude 
of kin resemblances in intelligence, height, cephalic index, eye color, 
etc. The point seems to deserve emphasis, however, that merely 
biometric studies of kin resemblance (under uncontrolled conditions 
of environment), do not yet have the necessary psychological, genetic, 
and mathematical basis for anything like rigorous, quantitative 
interpretation of results. Experimental or pre-experimental studies 
of the type of Burks,”* Freeman,”* Muller,? Newman,'® Gesell,’ 
and Tryon!® seem indicated. 


Note.—Discussion of the foregoing paper with Dr. Tryon has brought forth
the following points: 


1. The rise in correlation attending the pooling of tests may not be spurious, 
in the sense that the rise may be interpreted in two ways: (a) One possible cause 
of the rise may be the closer balancing, in the pool, of factors uncorrelated between 
the siblings (this accords with Spearman’s two-factor theory of mental abilities). 
Whether either the general factor or the specific factors are hereditary or 
environmental or both, is not divulged by the correlation coefficients alone. (b)
An alternative cause of the rise may be the fuller sampling of abilities in the 
pool than in any individual test (this accords with a multiple-factor theory). 





assumptions concerning both the mode of inheritance and the system of mating. 
In the case of human mental inheritance, ignorance of the mode of inheritance 
would appear to preclude confident use of any one of Wright’s formulæ.

*Inasmuch as a zero coefficient remains unaltered by the correction for 
restricted range.”
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2. A difference between parent-child correlations in two psychologically 


unitary traits may argue as well for a different mode of inheritance as for a differ- 
ential effect of environment. 


3. In the comparison of correlation coefficients, due consideration must be
given to the nature or make-up of the variables concerned. It is worth noting
that the effect of pooling usually diminishes rapidly, as the size of the pool increases. 
This suggests the desirability of comparing either simple elementary traits, in 
which the effect of pooling is absent; or highly complex traits, in which the effect 
of pooling is fairly stable, and perhaps uniform. 
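The remark in point 3, that the effect of pooling diminishes rapidly, can be made concrete under interpretation (a) of point 1. The sketch below is an illustration only and does not come from the paper. It assumes standardized tests that each load equally on one general factor, specific factors that are uncorrelated between siblings, and a sibling correlation `rho_general` on the general factor alone; the function name and the numbers are hypothetical.

```python
def pooled_sibling_correlation(k, loading, rho_general):
    """Sibling correlation of the sum of k tests under the assumed model.

    Each test = loading * g + specific part, with unit variance.
    Pool variance:       k^2 * a^2 + k * (1 - a^2)
    Sibling covariance:  k^2 * a^2 * rho_general
    Dividing and cancelling k gives the expression returned below.
    """
    a2 = loading ** 2
    return k * a2 * rho_general / (k * a2 + (1 - a2))


# Illustrative values only: loading 0.6, sibling correlation 0.5 on the general factor.
for k in (1, 2, 4, 8, 16):
    print(k, round(pooled_sibling_correlation(k, 0.6, 0.5), 3))
```

In this model the coefficient rises toward `rho_general` as tests are added, and each added test contributes less than the one before, which matches the suggestion to compare either elementary traits or pools large enough for the effect to have levelled off.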


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A NOTE ON METHODS OF MEASURING RELIABILITY 


T. G. FORAN 
Catholic University of America 


Three procedures are available for measuring the reliability of 
tests. According to the first, the same form of the test is administered 
twice. The second involves estimating the reliability of the entire test 
or scale from the correlation of its halves. In the third method two 
forms of the test, each given once, furnish the data from which the 
measures of reliability are calculated. The third procedure is generally
preferred to the others. The three methods obviously differ in the conditions
under which the reliability of a test is determined.
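For the second procedure, the step from the correlation of the halves to the reliability of the whole test is conventionally taken with the Spearman-Brown formula. The note does not say which formula it has in mind, so the sketch below is an assumption, and the function name and sample value are illustrative.

```python
def spearman_brown(r_part, length_factor=2.0):
    """Estimated reliability of a test lengthened by `length_factor`.

    With length_factor = 2 this is the usual split-half step-up:
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_part / (1 + (length_factor - 1) * r_part)


# Hypothetical correlation between the two halves of a test.
r_half = 0.80
print(round(spearman_brown(r_half), 3))
```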

The data in this study were compiled for the purposes of comparing 
the first and third methods as described above. Four schools were 
used. In one, the first list of the Morrison-McCall Spelling Scale 
was given twice to all pupils from Grades II to VIII, inclusive.
In the second school, the second list of the scale was used twice. In 
the third school, the first and second lists were used in that order. 
In the fourth, the second and first lists were used. In all grades of the 
four schools, a day intervened between the first and second tests. The 
rules for administering and scoring the tests stated in the scale were 
followed precisely and all computations were performed twice on a 
calculating machine. 

Identifying information, means, standard deviations, and several 
measures of reliability are presented in Table I. The averages of 
the reliability coefficients are higher when the same form is repeated 
than when duplicate forms are employed. When comparisons are made
between the same grades in the four schools, all reliability coefficients
with one exception are higher when either List 1 or List 2 is repeated
than when the two lists are employed. Nine of the twelve
probable errors of score are lower for the single form method. Further 
confirmation of these results is obtained from the probable errors of 
estimated true scores. 
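The two probable errors mentioned here follow the formulas given in the footnotes to Table I: .6745σ√(1 − r) for a score and .6745σ√(r − r²) for an estimated true score. The sketch below applies them to the first row of Table I. Which standard deviation the author used is not stated; taking the mean of the two standard deviations is an assumption here, and the function names are illustrative.

```python
import math

PE_FACTOR = 0.6745  # converts a standard error into a probable error


def pe_score(sd, r):
    """Probable error of an obtained score: .6745 * sd * sqrt(1 - r)."""
    return PE_FACTOR * sd * math.sqrt(1 - r)


def pe_estimated_true_score(sd, r):
    """Probable error of an estimated true score: .6745 * sd * sqrt(r - r^2)."""
    return PE_FACTOR * sd * math.sqrt(r - r * r)


# First row of Table I (List 1 given twice, Grades 2 to 4).
sd = (6.54 + 6.68) / 2  # assumption: mean of the two standard deviations
r = 0.911
print(round(pe_score(sd, r), 2), round(pe_estimated_true_score(sd, r), 2))
```

With that assumption the two results agree with the tabled 1.33 and 1.27.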

In order to render the reliability coefficients directly comparable
through holding the variability constant, each coefficient has been
used with Kelley’s formula:

$$\frac{\sigma}{\Sigma} = \frac{\sqrt{1 - R}}{\sqrt{1 - r}}$$

with $\Sigma$, or the larger standard deviation, arbitrarily set at 10. In
nine of the twelve possible comparisons of the same grades in the 
four schools, higher coefficients are secured from the two applications 
of the same form. In one of the three comparisons in which the reverse
occurred, the difference is negligible. 
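Solving Kelley’s formula for R gives R = 1 − (σ/Σ)²(1 − r). A minimal sketch of that adjustment follows, checked against two rows of Table I. The function name is illustrative, and using the mean of the two standard deviations for σ is an assumption, since the note does not say which value was used.

```python
def kelley_adjusted_reliability(r, sd, reference_sd=10.0):
    """Reliability expected in a group whose standard deviation is reference_sd.

    From sd / reference_sd = sqrt(1 - R) / sqrt(1 - r):
        R = 1 - (sd / reference_sd)^2 * (1 - r)
    """
    return 1 - (sd / reference_sd) ** 2 * (1 - r)


# Table I, List 1 given twice.
grades_2_to_4 = kelley_adjusted_reliability(0.911, (6.54 + 6.68) / 2)
grades_7_and_8 = kelley_adjusted_reliability(0.933, (4.77 + 4.30) / 2)
print(round(grades_2_to_4, 3), round(grades_7_and_8, 3))
```

Both results agree with the tabled R values of .961 and .986, which supports reading the formula in this form.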

In Table II the measures of reliability have been averaged. With 
three grade-groups and two schools for each method, six measures 
contribute to each of the averages. The difference between the 
means of the coefficients of reliability appears to be considerably 
greater than between the R’s. A better comparison is possible through 
the measures of improvement over chance. 










































































                                          Same form   Duplicate   Difference
                                            twice       forms

Mean r                                       .935        .866
Improvement over chance (r), per cent        64.5        50.0        14.5
Mean R                                       .969        .951
Improvement over chance (R), per cent        75.3        69.1         6.2

TABLE I.—MEASURES OF RELIABILITY OBTAINED WITH DIFFERENT
ARRANGEMENTS OF TESTS

Order of   Grades      N     M1      M2     SD1    SD2     r     PE(r)    R¹    PE score²   PE est. t.s.³
tests

1-1        2, 3, 4     95   19.12   20.60   6.54   6.68   .911   .012   .961     1.33         1.27
1-1        5, 6        66   32.66   35.51   8.20   7.59   .930   .011   .956     1.41         1.36
1-1        7, 8        66   43.30   44.91   4.77   4.30   .933   .011   .986      .79          .76
1-1        Mean                                           .925          .968     1.18         1.13

2-2        2, 3, 4     80   20.98   21.82   6.99   7.50   .972   .005   .985      .82          .81
2-2        5, 6        56   32.60   35.20   8.07   7.51   .897   .017   .938     1.69         1.60
2-2        7, 8        53   40.06   41.07   6.22   5.43   .968   .005   .989      .70          .69
2-2        Mean                                           .946          .971     1.07         1.03

1-2        2, 3, 4    243   18.93   19.78   7.39   6.71   .893   .008   .947     1.56         1.47
1-2        5, 6       152   32.24   31.24   5.90   6.19   .883   .012   .957     1.39         1.31
1-2        7, 8       149   41.37   40.19   4.54   5.01   .838   .016   .963     1.30         1.19
1-2        Mean                                           .872          .956     1.42         1.32

2-1        2, 3, 4    125   18.86   20.18   8.38   8.88   .926   .009   .945     1.58         1.52
2-1        5, 6       109   33.66   35.49   7.39   6.15   .888   .017   .948     1.53         1.44
2-1        7, 8        91   41.69   40.98   4.62   5.15   .767   .029   .944     1.59         1.39
2-1        Mean                                           .860          .946     1.57         1.45

¹ R as obtained from Kelley’s formula: $\sigma/\Sigma = \sqrt{1 - R}/\sqrt{1 - r}$, with $\Sigma = 10$.

² $PE_{\text{score}} = .6745\,\sigma\sqrt{1 - r}$.

³ $PE_{\text{est. t.s.}}$ = probable error of estimated true score $= .6745\,\sigma\sqrt{r - r^2}$.







TABLE II.—MEANS OF SIX MEASURES OF RELIABILITY FOR EACH ARRANGEMENT
OF TESTS

                                   Same form      Duplicate
                                     twice          forms
                                  (1-1; 2-2)     (1-2; 2-1)

Mean r                               .935           .866
Mean R                               .969           .951
Mean PE score                        1.12           1.49
Mean PE est. true score              1.08           1.39
Total number of pupils                416            869





The improvement over chance has been computed from the
formula:¹

$$I_r = 100\left(1 - \sqrt{1 - r^2}\right)$$

While the means of the reliability coefficients over-emphasize the 
difference between the methods, a substantial difference remains 
when variability is held constant. 
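As a check on the figures, the improvement over chance can be recomputed from the mean coefficients. The sketch below applies the formula above to the mean r and mean R values for each method; the function name is illustrative and not from the note.

```python
import math


def improvement_over_chance(r):
    """Per cent improvement over chance: 100 * (1 - sqrt(1 - r^2))."""
    return 100 * (1 - math.sqrt(1 - r * r))


# Mean coefficients: (same form twice, duplicate forms).
mean_r = (0.935, 0.866)
mean_R = (0.969, 0.951)

for label, (same, duplicate) in (("r", mean_r), ("R", mean_R)):
    i_same = improvement_over_chance(same)
    i_dup = improvement_over_chance(duplicate)
    print(label, round(i_same, 1), round(i_dup, 1), round(i_same - i_dup, 1))
```

The difference between the methods shrinks from about 14.5 points for r to about 6.2 points for R, which is the sense in which the raw coefficients over-emphasize it.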


Table III contains the significance ratios for the combinations of
reliability coefficients. The significance ratio is the difference between
the coefficients divided by the probable error of the difference. The
probable errors of the difference have been found by means of the usual
formula:²

$$PE_{\text{diff.}} = \sqrt{PE_{r_1}^2 + PE_{r_2}^2}$$


TABLE III.—SIGNIFICANCE RATIOS OF DIFFERENCES BETWEEN RELIABILITY
COEFFICIENTS

                                  Grades
Tests compared     II, III, IV     V, VI     VII, VIII      Mean

1-1 and 2-2           −4.69        +1.65       −2.92        3.09
1-2 and 2-1           −2.75        −0.24       +2.15        1.71
1-1 and 1-2           +1.29        +2.94       +5.00        3.08
1-1 and 2-1           −1.00        +2.10       +5.35        2.82
2-2 and 1-2           +8.78        +0.67       +7.64        5.70
2-2 and 2-1           +4.60        +0.38       +6.93        3.97






¹ Holzinger, Karl J.: “Statistical Methods for Students in Education.” Ginn
and Co., New York, 1928, p. 166.


² Garrett, Henry E.: “Statistics in Psychology and Education.” Longmans,
Green and Co., New York, 1926, p. 171.
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The significance ratios are positive when the first coefficient of the 
pair is higher than the second. The means of the ratios have been 
found without regard to the signs of the ratios. It is noteworthy 
that the order of the tests has no influence on the results when the 
tests are similar forms. The most important differences occur when a 
reliability coefficient obtained from the same form used twice is com- 
pared with the reliability coefficient from similar forms. The signifi- 
cance ratios are much larger for Grades VII and VIII combined than 
for Grades V and VI together. In general, reliability coefficients 
from repeated tests are significantly higher than the same measures 
found from one use of each form of the test. 
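The significance ratio is simple to reproduce from the coefficients and probable errors in Table I. The sketch below is illustrative only; the function name is an assumption, and because the tabled probable errors are rounded to three places, recomputed ratios may differ slightly from some of the printed ones.

```python
import math


def significance_ratio(r1, pe1, r2, pe2):
    """Difference between two coefficients over the probable error of the difference."""
    pe_diff = math.sqrt(pe1 ** 2 + pe2 ** 2)
    return (r1 - r2) / pe_diff


# Grades II to IV: List 1 twice (r = .911) against List 2 twice (r = .972).
print(round(significance_ratio(0.911, 0.012, 0.972, 0.005), 2))

# Grades VII and VIII: List 1 twice (r = .933) against the order 2-1 (r = .767).
print(round(significance_ratio(0.933, 0.011, 0.767, 0.029), 2))
```

The ratio is positive when the first coefficient of the pair is the higher, as the text states.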


TABLE IV.—PRACTICE EFFECTS AND EQUIVALENCE OF LISTS OF MORRISON-
MCCALL SPELLING SCALE

                  Order of                 Difference
Grades            tests         N         +         −

II, III, IV       1-1           95       1.48
V, VI             1-1           66       2.85
VII, VIII         1-1           66       1.61
Mean              1-1                    1.98

II, III, IV       2-2           80        .84
V, VI             2-2           56       2.60
VII, VIII         2-2           53       1.01
Mean              2-2                    1.48

II, III, IV       1-2          243        .85
V, VI             1-2          152                 1.00
VII, VIII         1-2          149                 1.18
Mean              1-2                               .44

II, III, IV       2-1          125       1.32
V, VI             2-1          109       1.83
VII, VIII         2-1           91                  .71
Mean              2-1                     .81

Difference is positive when second test has higher mean. Means are not
weighted.


Some measure of practice effect under the same conditions is 
possible from the data at hand. Table IV contains the differences 
between the means for all grades and schools. The practice effect 
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involved in taking the same test twice is considerable. It is greater 
for List 1 than for List 2, the means being 1.98 and 1.48, but this may be
a sampling error. The algebraic mean of the differences between the
two tests in the last six groups is +.19. There are some indications
that List 2 is more difficult than List 1. When the order of the tests is
1-2, the mean difference is −.44 (a loss), but when the order is reversed
the mean gain is .81.
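The figures in this paragraph can be rechecked from the first-test and second-test means in Table I. The sketch below is only an arithmetic check; the dictionary layout and function name are illustrative, and the means are transcribed from Table I.

```python
# (first-test mean, second-test mean) for the three grade groups of each school.
means = {
    "1-1": [(19.12, 20.60), (32.66, 35.51), (43.30, 44.91)],
    "2-2": [(20.98, 21.82), (32.60, 35.20), (40.06, 41.07)],
    "1-2": [(18.93, 19.78), (32.24, 31.24), (41.37, 40.19)],
    "2-1": [(18.86, 20.18), (33.66, 35.49), (41.69, 40.98)],
}


def mean_difference(pairs):
    """Unweighted mean of (second mean - first mean); positive means a gain."""
    return sum(second - first for first, second in pairs) / len(pairs)


for order, pairs in means.items():
    print(order, f"{mean_difference(pairs):+.2f}")

# Algebraic mean over the six groups that took duplicate forms.
duplicate_groups = means["1-2"] + means["2-1"]
print(f"{mean_difference(duplicate_groups):+.3f}")
```

The last figure comes to about +.185, which the text reports as +.19; the mean for the order 1-2 is negative, in line with List 2 being the harder list.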


CONCLUSIONS 


1. When allowances are made for differences in variability, 
reliability coefficients are higher for repetitions of the test than for 
similar forms. 

2. The order in which the tests are given has no effect on the results, 
even when one form is slightly more difficult than the other. 

3. The practice effect of taking the same test twice is estimated 
to be 1.5 times the practice effect from duplicate forms. This obtains 
only for the Morrison-McCall Spelling Scale. 

4. In view of the results of this study, it is important to consider 
the conditions underlying the determination of test reliability. 

5. Using the method of two forms, the probable error of estimated 
true scores of the Morrison-McCall Spelling Scale (and probably 
for any well scaled spelling test of fifty words) is 1.4 words. 










SAMPLING ERROR OF TETRAD DIFFERENCES 


C. SPEARMAN 


University of London 


In the now prevalent usage of the formula for the probable error 
or variance of tetrad differences,¹ there is apt to be scant account taken
of the special assumptions upon which this formula is based. One 
of these, namely that the variables should have a normal frequency 
distribution, is only dangerous when great accuracy is wanted.² But
the other assumption can do much more harm; it is that the correla- 
tions are measured by the ordinary product moment coefficients, 
so that the variance of the coefficient


$$\sigma_r^2 = (1 - r^2)^2 / N.$$


One case where this latter assumption may be far from justified is 
that of coefficients which have been corrected for attenuation. And
even worse is the case of coefficients derived from any of the customary 
four-fold tables. On the other hand, the coefficients derived from 
ranks (with the differences squared) would not seem to introduce any 
disturbance of appreciable size. 

Of course, there is nothing to prevent anyone from so modifying 
the formula for the tetrad sampling error that it does become valid 
for any coefficient required. All that has to be done is that in deriving 
the formula the above quoted $\sigma_r^2$ should be replaced by the variance
which is appropriate to that coefficient. Such variances can be found
in any good textbook; for example, in those of Holzinger, Garrett, or
Kelley.
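The substitution described here can be sketched in code by treating the variance of the coefficient as a replaceable ingredient. The example below is an illustration, not the tetrad formula itself, which the note does not reproduce. It shows the product-moment variance quoted above and a standard error that accepts any other variance function; all names are assumptions.

```python
import math


def product_moment_variance(r, n):
    """Sampling variance of a product-moment coefficient: (1 - r^2)^2 / N."""
    return (1 - r * r) ** 2 / n


def standard_error(r, n, variance=product_moment_variance):
    """Standard error of a coefficient under a chosen variance formula.

    Pass a different `variance` function when the coefficient is not an
    ordinary product-moment correlation (for instance, one corrected for
    attenuation or taken from a four-fold table), using the variance given
    for that coefficient in a textbook.
    """
    return math.sqrt(variance(r, n))


print(standard_error(0.5, 100))           # default product-moment case
print(0.6745 * standard_error(0.5, 100))  # the corresponding probable error
```

Keeping the variance as a parameter mirrors the point of the note: the derivation stays the same and only the variance appropriate to the coefficient changes.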





¹ See the present writer’s “Abilities of Man,” appendix, pp. x–xi (except that
on p. x, N should be replaced by N”2).
² See this JOURNAL, Nov., 1930, “Disturbers of Tetrad Differences.”
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NEW PUBLICATIONS IN EDUCATIONAL 
PSYCHOLOGY AND RELATED FIELDS OF 


EDUCATION


CONDUCTED BY FRANCES M. FOSTER











Abnormal Psychology, Its Concepts and Theories, by H. L. Hollingworth. 
New York: The Ronald Press Company, 1930. Pp. XI + 590. 


A very distinctive as well as significant contribution to the field of 
abnormal psychology is Dr. H. L. Hollingworth’s latest book. This 
work represents a serious effort to make the concepts and theories 
concerning mental abnormalities psychologically intelligible. It 
also represents a determined effort to systematize these concepts and 
theories. The system constructed has a logical place in it for facts of 
consciousness along with behavior but the evaluations and reinter- 
pretations of materials selected for presentation can hardly be said 
to be characterized by catholicity. Most viewpoints are evaluated 
but the psychoanalytic—labelled by Hollingworth, the psychoana- 
logical—is vigorously attacked. His own conceptual system is con- 
sistently applied. In this system the general pattern of mental 
activity, abnormal as well as normal, is said to be characterized by the 
redintegrative sequence. “Partial stimuli now occurring function
for former antecedents of greater complexity.” Learning and sagacity
are the two important aspects of these processes. “Without learning
there could be no mental activity; without sagacity, mental activity
becomes biologically and socially ineffective.” Feeblemindedness is,
accordingly, “relative inaptitude for redintegrative activity.” “Unsagacious
redintegrated responses give the picture of the psychoneuroses.
On the postural level such symptoms give the somatic, physical picture
of hysteria; on the autonomic level, they comprise the complaints of
the neurasthenic; on the symbolic level they show themselves in fixed
ideas, obsessive thoughts, morbid pictures and ruminations, as in
psychasthenia.” Psychotherapy consists in teaching the patient to
substitute a symbolic for a postural or an autonomic response. Lack 
of space prohibits a fuller consideration of the fundamental con- 
ceptions in this system. They are the same concepts which the author 
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has previously used in reconstructing the facts and principles in general 
psychology. And the reconstruction is practically as thoroughgoing. 

In the Preface the author differentiates his own interest in mental 
abnormality as a student of natural phenomena from the “everyday”
interests of educators or hygienists and the “technological” interests
of psychiatrists or clinical psychologists. This implies to him a 
dominant interest in concepts rather than clinical pictures, ideas 
rather than cases, principles rather than procedures; in short, an 
interest in understanding rather than preventing or relieving mental 
abnormality. Many cases are presented—more than in some of the 
books which emphasize clinical pictures. The book really contains 
more practical features than the foregoing statements imply. The 
chief reason, of course, is that in dealing with problems of human 
behavior the point of demarcation between diagnosis and treatment 
is hardly perceptible—the goal from the beginning to end can be said 
to be fuller understanding. 

The order of presentation as well as the organization of the mate- 
rials in each of the twenty-five chapters is apparently the result of 
much conscious planning. The outline is clear and the march continuous.
The book begins with a succinct chapter on the subject-matter
and uses of abnormal psychology. Abnormal behavior is here
differentiated from desirable behavior. “It may be comfortable,”
we are told, “to be normal, since this will mean that we are not conspicuous;
but normality, in the scientific sense, is nothing on which to
pride oneself.”

Following the introductory chapter are six very comprehensive 
chapters describing historical and contemporary viewpoints. The 
only one that can not be passed by without comment is the one which 
purports to give a fair account of psychoanalysis. There is no deny- 
ing the validity of a good deal of the criticism. Nor can one deny 
Hollingworth’s greater clarity. But one is tempted to add that at 
least some of the clarity is obtained by ignoring fundamental difficulties
which confront psychologists who are more concerned with
people than with concepts.

In the consideration of personality deviations the author first 
attacks the simpler and more measurable deviations, advances to the 
“Dwellers of Neurotica” and thence to the psychotics and other more 
complex deviations. Two chapters are given to the feebleminded;
nine to the psychoneuroses and one chapter apiece to each of the
following topics: Stage-fright and dream; stuttering and stammering; 
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aphasia and asymbolia; epilepsy; constitutional psychopathic states; 
personality types and functional psychoses; and mental disorder and 
the effect of drugs. 

Objective studies of the neurotic and the feebleminded are fully 
treated. In the discussion of the neuroses more than the usual amount 
of attention is devoted to the contributions of Babinski, Hurst and
Rosanoff as well as those of Janet and Prince. The psychoanalytic 
contributions of Freud, Adler and Rivers are presented as Herbartian 
conceptions of the neuroses. The result is interesting reading but 
can hardly be considered as more than a forced interpretation; a 
waste-basket variety of categorizing. Hollingworth’s explanation of 
primitive behavior in terms of redintegrative sequence is no more 
realistic than Freud’s which he criticizes—neither is based on actual 
anthropological studies. No more convincing is the description of all 
Freudian forgetting as ‘‘underlearned material.”’ 

A critical evaluation of fictional neuro-anatomy, a valid criticism 
of Cotton’s overvaluation of the rôle of infection in mental disorders,
a much needed emphasis on the rôle of the learning process in the
development of the functional disorders, a healthy emphasis on the 
how as well as the what of the thinking of the contributions con- 
sidered—these, in the opinion of the reviewer, are among the most 
pleasing as well as distinguishing features of the book. 

H. MELTZER.
Psychiatric Clinic, Saint Louis, Missouri. 





The Guidance of Mental Growth in Infant and Child, by Arnold Gesell. 
New York: The Macmillan Company, 1930. Pp. XI + 322. 


If parents and educators more frequently combined their avowed 
love and respect for little children with the sympathetic point of view 
gained by scientific study of mental development, many child adjust- 
ment problems would be solved. Dr. Gesell shows a rare appreciation 
and respect for childhood’s complexities, and he beholds with wonder- 
ment the orderly progression of mental unfoldment which his own 
techniques, skillfully applied, have revealed. He shatters many an 
illusion and yet builds up our faith in the possibilities of solving behav- 
ior problems. If in his enthusiasm the author verges at times on the 
sentimental and stresses the obvious, these tendencies do not detract 
from the soundness of the principles discussed. There is occasional 
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repetition due to inclusion of previously published articles which 
overlap in content. 

Children have not always been so sympathetically understood,
but the growing tendency toward clearer conceptions of the purpose 
of childhood and parent-child relationships is revealed in the accounts 
of early nursery school projects and in the correspondence of intelligent 
mothers of whom Susanna Wesley was an illustration. Older con- 
cepts of child guidance are illustrated with an amusing and charming 
series of old prints and lithographs. The modern nursery school with 
its attendant program of scientific child study is the newest addition 
to institutional education and promises to integrate the activities of 
home and school of children below kindergarten age. Instead of the 
kindergarten giving formal instruction based on dubious psychological 
principles, we find the modern kindergarten assuming an important 
role in child guidance, forming a continuum with the nursery school and
the elementary grades. 

Concerning the mental development of the child as determined by 
the ingenious methods of the Yale Psycho-clinic, the author has many 
significant findings to report. The most important principles formu- 
lated are those concerning the rapidity, complexity and orderliness of 
mental growth from birth to maturity. 

Dr. Gesell formulates normality of mind in terms of: (1) Whole- 
some personal habits of living; (2) wholesome habits of feeling; and 
(3) healthy attitudes of action. 

Concerning the part played in development by the two factors of
heredity and environment there has been a great deal of misunder- 
standing. Dr. Gesell’s treatment of the topic is one of the sanest 
and most interesting of existing reports. He finds that the child 
is remarkably well insulated against chance environmental influences 
and conditioning and that much of what in the young child appears to 
be the result of training and teaching is really the result of mental 
unfoldment and natural maturation. Because the child is a human 
being certain behavioristic trends are bound to assert themselves in
spite of even decidedly adverse environmental conditions. Physical
handicaps do not drastically alter the behavior capacities of the
child. The inevitableness and surety of maturation safeguard the
child against adventitious circumstances.

The validity of the conclusions reported is well established and 
their application in child guidance work will be unquestionably 
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beneficial. Every one who has child training responsibilities will 
find Dr. Gesell’s book interesting and valuable. 


GERTRUDE HILDRETH. 
The Lincoln School of Teachers College, Columbia University. 





THE REVIEW OF EDUCATIONAL RESEARCH 


The first issue of the first journal to attempt a systematic review 
of educational research has just appeared. The Review is the official 
organ of the American Educational Research Association which is now 
a Department of the National Education Association. 

The first issue deals with the curriculum. It was prepared by a 
committee consisting of Dr. Henry Harap, Dr. William L. Connor, and 
Dr. Ralph W. Tyler, assisted by several other persons. This issue 
includes an extensive classified bibliography and treats of methods 
of curriculum-making, studies of the objectives of the curriculum, 
studies of learning which bear upon the curriculum, time allotment 
and grade placement, etc. Four other issues dealing with teacher 
personnel, school organization, special methods up to the end of the 
elementary school, and individual differences and psychological tests, 
are scheduled to appear during the year. The whole field of research 
has been divided into fifteen topics which will be covered in three years. 
Three numbers will appear during the spring and two during the fall. 
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